PROTEIN AND DNA CODING THEREFOR

The present invention relates to a protein capable of bioluminescence, cDNA coding
therefor and their uses, inter alia, in diagnostics and therapy. In particular, this invention
relates to the cloning and sequencing of cDNA coding for pholasin from the bivalve
mollusc Pholas dactylus.

The term 'bioluminescence' refers to the emission of light resulting from a chemical
reaction within, or produced by, a living organism. The essential components of the
chemical reaction are: an organic molecule, usually comprising a luciferin; oxygen or one
of its metabolites; and an enzyme or luciferase that catalyses the oxidation of the luciferin.
The chemiluminescent reaction responsible for bioluminescence may be represented as
follows:

Luciferin + O₂, or O₂⁻, or H₂O₂, or OH·, or OCl⁻, or OX⁻, or ¹O₂ (+ luciferase) →
oxyluciferin + light.

Up to three other substances may also be required to generate light or to generate light of
the required colour and intensity. These are as follows:

(a) A cation, such as H⁺, Ca²⁺, Mg²⁺ or a transition metal cation (eg Cu⁺/Cu²⁺,
Fe²⁺/Fe³⁺, La³⁺ and V³⁺);

(b) A co-factor such as NAD(P)H, FMN or ATP; and/or

(c) A fluor as an energy transfer acceptor.

Five chemical families of luciferin are known: 

(a) Aldehydes (found in the freshwater limpet Latia, earthworms, and with FMN in
bacteria);

(b) Imidazolopyrazines, which are the compounds most commonly responsible for
bioluminescence in the sea (found in Sarcomastigophora, Cnidaria, Ctenophora,
Annelida, Chaetognatha, some Arthropoda, some Mollusca and some Chordata);

(c) Benzothiazoles (found in beetles such as fireflies and glow-worms);

(d) Linear tetrapyrroles (found in dinoflagellates, euphausiid shrimp and some fish);
and

(e) Flavins (found in bacteria, fungi, polychaete worms and some molluscs).

Chemiluminescent reactions involving these luciferins may produce a glow or a flash with
an emission of violet, blue, blue-green, green, yellow, orange or red light, or occasionally
UV or IR light. The light emission may be linearly or circularly polarised. The luciferin or
its product may also be detected and quantified by fluorescence or phosphorescence. As a
chemical reaction is directly responsible for the light emission, it does not require exposure
to UV, visible or IR light. However, some bioluminescent systems, such as that in the red
organ of the deep sea fish Malacosteus, exhibit a photo-chemiluminescence, where light
can trigger or enhance the chemiluminescent reaction. [Reference is directed to
Chemiluminescence: Principles and Applications in Biology and Medicine, A K Campbell
(1988), Horwood/VCH Chichester, Weinheim.]

In the case of some bioluminescent proteins, the luciferin is so tightly or covalently bound
to the protein molecule that it does not diffuse away into the surrounding fluid as a result
of the chemiluminescent reaction. In this case, the protein-luciferin complex is known as a
photoprotein, and the protein itself is referred to as an apophotoprotein. Some
bioluminescent proteins are proteins whose light emission or radiation depends on or may
be altered by oxygen or one of its metabolites; these bioluminescent proteins are
hereinafter referred to as 'bioluminescent oxidative indicator proteins' (BOIPs). BOIPs
may thus be photoproteins or luciferin-luciferase systems.

BOIPs, therefore, may be used to detect and quantify oxygen or one of its metabolites in
individual cells, defined compartments of living cells such as the nucleus, whole organs and
organisms - both animals and plants, including microbes such as viruses and bacteria and
protozoa - as well as substances of biological interest such as substrates, metabolites,
vitamins, drugs, intra- and extra-cellular signals, enzymes, antigens, antibodies and nucleic
acids. Heretofore, it has only been known to employ native BOIPs extracellularly.

The present invention therefore relates to a method for the detection and/or measurement
of oxygen or one of its metabolites in live cells (intracellular), which method comprises
providing a BOIP, such as native or chemically- (or genetically-) modified BOIP or a
'rainbow protein' based on such a BOIP, intracellularly and thereafter detecting and/or
quantifying light emission therefrom and/or changes in colour, intensity and/or polarisation
of emission(s) therefrom.

Furthermore, it has now been found that, by sequencing the BOIP and identifying the
cDNA encoding therefor, the recombinant BOIP can also be used in such a method, as can
a chemically- or genetically-modified recombinant BOIP, or a 'rainbow protein' based on
such a BOIP. For example, the bivalve mollusc Pholas dactylus has been shown to
comprise a native photoprotein, which interacts with a luciferase when the two are secreted
together by the mollusc, producing light when O₂ or one of its metabolites is present.
The purification and properties of Pholas dactylus luciferin and luciferase are described
by Michelson in Methods in Enzymology LVII 385-406 (1978). The detection of activation
of neutrophils by detection of superoxide anion is described by
Roberts in Anal Biochem 160 139-148 (1987) and by Muller et al in J Biolum Chemilum 3
105-113 (1989). The native photoprotein (known as pholasin) is made up of a
glycosylated apoprotein (34kDa) with a small organic molecule, the luciferin, tightly bound
to it. This luciferin (whose structure is unknown - Muller and Campbell in J Biolum
Chemilum 5 25-30 (1990)) can be extracted from the protein moiety - the apopholasin - or
from the organism by a standard treatment, such as mild acid. The pholasin may be
collected from live molluscs found in sedimentary rocks at low water along the south coast
of England from Plymouth to Folkestone and also along the French channel coast and in
the Mediterranean. Further details may be obtained from marine fauna and the references
cited herein.

We have surprisingly found that pholasin can generate light even without the presence of
the corresponding luciferase by addition of oxygen metabolites such as O₂⁻, H₂O₂, OCl⁻ or
other oxyhalide anions, or organic peroxides, and certain organic solvents such as dimethyl
sulphoxide (DMSO) or dimethyl formamide (DMF).

We have now identified the cDNA encoding the (non-glycosylated) apoprotein of
pholasin, which may also be called 'apopholasin'. Accordingly, the present invention
further provides an isolated, purified or recombinant nucleic acid sequence
comprising:

(a) A sequence encoding the apophotoprotein of pholasin (alternatively, 'apopholasin');

(b) A sequence substantially homologous to or that hybridises to sequence (a) under
stringent conditions; or

(c) A sequence substantially homologous to or that hybridises under stringent
conditions to the sequence (a) or (b) but for the degeneracy of the genetic code; or

(d) An oligonucleotide specific for any of the sequences (a), (b) or (c).
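
By way of illustration only, the phrase 'but for the degeneracy of the genetic code' in (c) can be sketched in Python as follows. The peptide MDEGK occurs within the cloned protein fragments quoted in Table 1; the two nucleotide sequences, the partial codon table and the function names are hypothetical choices made for this sketch and are not taken from Figure 1.

```python
# Partial standard genetic code: only the codons needed for this illustration.
CODON_TABLE = {
    "ATG": "M",
    "GAT": "D", "GAC": "D",
    "GAA": "E", "GAG": "E",
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "AAA": "K", "AAG": "K",
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon using CODON_TABLE."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def count_differences(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Hypothetical coding sequences (assumptions, not sequences from the Figures).
SEQ_1 = "ATGGATGAAGGTAAA"
SEQ_2 = "ATGGACGAGGGCAAG"

if __name__ == "__main__":
    print(translate(SEQ_1), translate(SEQ_2))
    print(count_differences(SEQ_1, SEQ_2), "nucleotide differences")
```

The two sequences differ at the third position of four codons yet encode the same peptide, which is the situation contemplated by (c).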

The present invention will now be further described with reference to the accompanying
Figures, in which:

Figure 1 shows three different cDNAs encoding apopholasin, referred to as clones 40, 3
and 5. Nucleotides in bold type show codons used for initiation and termination of
translation;

Figure 2 shows the three sequences of Figure 1 aligned to demonstrate the sequence
similarity. This figure was generated by Clustal. Positions which are indicated with a star
are identical in all three clones. The codons for the initiation and termination of translation
are highlighted in bold;

Figure 3 shows the oligonucleotides used for the complete sequencing of the positive
clones. These were identified from the cDNA library; their positions in clone 40 are
shown. Oligonucleotides are shown in bold type; portions of the flanking sequence of the
Bluescript plasmid are shown in italic type;

Figure 4 shows the protein sequence encoded by the DNA sequence coding for
apopholasin. Figure 4A shows the complete sequence of the positive clone 40
identified from the Pholas dactylus light organ library. The first 20 amino acids at the
N-terminus are a signal peptide, and this can either be retained or removed when generating
the BOIP as described in this invention. Figure 4B shows the cDNA coding for
apopholasin with untranslated 5' and 3' ends. The untranslated regions are also shown;

Figure 5 describes the protein sequence for pholasin with (5B) and without (5A) the signal
peptide;

Figure 6 shows the sequence for apopholasin genomic DNA. Two gDNA clones were
identified but no introns were found; the Figure shows an alignment of the cDNA from
cDNA clone 40 and the gDNA amplified by both rTth DNA polymerase XL and BioXAct
polymerase. The sequences of the PCR product and the inserts in pGEM T were aligned
with the sequence of the cDNA of clone 40 and were identical to this cDNA;

Figure 7 describes the oligonucleotides used for screening and expression. Degenerate
oligonucleotides for library screening are shown in Figure 7A; non-degenerate ones in
Figure 7B; and oligonucleotides used for protein expression are shown in Figure 7C;

Figure 8 lists the main restriction sites in the DNA for engineering pholasin; and

Figure 9 is a schematic representation of Figure 8 mapped to the sequence of Figure 4A
(translated region).



Accordingly, the present invention provides recombinant DNA encoding the
apophotoprotein apopholasin and comprising the nucleotide sequence of the sequence
disclosed in Figure 4B. Three different cDNAs coding for apopholasin have been isolated,
having differing non-coding regions, as disclosed in Figure 1. The genomic
DNA (gDNA), which contains no introns, has been shown (Figure 6) to comprise the same
basic sequence as the cDNA.

Pholasin is a glycoprotein having 11.1 glucosamine, 9.8 fructose, 7.1 mannose and 5.2
galactose residues. The protein encoded by the cDNA for apopholasin has a calculated
molecular weight of 23,456, compared to 34,600 for the pholasin extracted from Pholas. The
difference in the molecular weights of native pholasin versus recombinant apopholasin is
due to the glycosylation of the native protein and to the luciferin. The isoelectric point of
the translated protein calculated by the ISOELECTRIC command of the GCG programme is
3.84. The native protein
has a lower isoelectric point (<3.5), probably due to the presence of bound sulphate.
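
The arithmetic behind the molecular weight comparison can be set out in a short Python sketch. The two molecular weights are those quoted above; the function names are illustrative assumptions, and the sketch makes no attempt to apportion the difference between the sugar residues and the luciferin.

```python
NATIVE_PHOLASIN_MW = 34_600          # pholasin extracted from Pholas
RECOMBINANT_APOPHOLASIN_MW = 23_456  # calculated for the cDNA-encoded protein


def non_protein_mass(native_mw: float, apoprotein_mw: float) -> float:
    """Mass attributed to glycosylation plus bound luciferin."""
    return native_mw - apoprotein_mw


def non_protein_fraction(native_mw: float, apoprotein_mw: float) -> float:
    """Fraction of the native mass not accounted for by the polypeptide."""
    return non_protein_mass(native_mw, apoprotein_mw) / native_mw


if __name__ == "__main__":
    mass = non_protein_mass(NATIVE_PHOLASIN_MW, RECOMBINANT_APOPHOLASIN_MW)
    frac = non_protein_fraction(NATIVE_PHOLASIN_MW, RECOMBINANT_APOPHOLASIN_MW)
    print(f"difference: {mass}; share of native mass: {frac:.1%}")
```

On these figures, roughly one third of the mass of native pholasin is not accounted for by the polypeptide chain.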

The three clones (Figure 2) isolated from the library encode a unique protein (Figures 4
and 5), which does not have the same amino acid sequence as any known protein in the
SwissProt database. The present invention therefore not only provides cDNA and RNA
coding for the protein, but also the recombinant protein per se, with or without
glycosylation units. A comparison of segments of the pholasin protein sequence with the
proteins in the SwissProt database identified several proteins with regions having a high
sequence similarity to regions of the cloned protein. These included several proteins which
interact with nucleotides (Table 1).



Table 1  A comparison of sections of the sequence of the cloned protein with sections
of proteins which interact with nucleotides. For each protein, the three lines show the
homologous region of the cloned protein, the homology (+ denotes a conserved amino
acid) and the homologous region of the selected protein.

tRNA-splicing endonuclease β subunit, Saccharomyces cerevisiae, EC 3.1.27.9

    cloned protein    SLYDEDNNGVMDEGKVIPSETIE
    homology          +L DEDNN + + G ++P E++E
    selected protein  NLRDEDNNLLDENGDLLPLESLE

    cloned protein    LDQDVELDYTW
    homology          LD DV  DYTW
    selected protein  LDHDVSKDYTW

ATP-AMP transphosphorylase, Cyprinus carpio, EC 2.7.4.3

    cloned protein    VMDEGKVIPSETIEDDIKDCGLLDQDVELDY
    homology          +M +G+++P +T+ D IKD  +   DV   Y
    selected protein  IMQKGELVPLDTVLDMIKDAMIAKADVSKGY

DNA primase, Synechocystis sp., EC 2.7.7.-

    cloned protein    EEVQCAMNWTQANEYVFNVD
    homology          ++VQ  M ++Q+ + +FN D
    selected protein  DQVQSLMRFSQSKQIIFNFD

Purine permease, Emericella nidulans

    cloned protein    VQCAMNWTQANEYV
    homology          + C+++WT+ N ++
    selected protein  IMCSVDWTRRNRFI

DNA repair protein complementing XP-A cells homologue, Drosophila melanogaster

    cloned protein    PDTVDEAEDTPSET
    homology          PDT DE EDT + T
    selected protein  PDTYDEEEDTYTHT

ATP synthase β chain, Peptococcus niger, EC 3.6.1.34

    cloned protein    DTVDEAEDTPSET
    homology          D +DEA + PSET
    selected protein  DPIDEAGEVPSET

DNA polymerase α, Homo sapiens, EC 2.7.7.7

    cloned protein    DEDNNGVMDEGKVIPSETIEDDIKD
    homology          D+D  G +++G+ I  +  +EDD  D
    selected protein  DDDGIGYVEDGREIFDDDLEDDALD
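
The homology lines of Table 1 can be illustrated with a minimal Python sketch that marks identical residues between two pre-aligned fragments of equal length. The fragments are taken from the first entry of Table 1; the function names are illustrative assumptions. The sketch marks identities only: the '+' symbols for conserved substitutions depend on a substitution matrix, which is not reproduced here.

```python
def match_line(seq_a: str, seq_b: str) -> str:
    """Return a line showing identical residues, with spaces elsewhere."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    return "".join(a if a == b else " " for a, b in zip(seq_a, seq_b))


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions that carry identical residues."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Second aligned segment of the tRNA-splicing endonuclease entry in Table 1.
CLONED = "LDQDVELDYTW"
SELECTED = "LDHDVSKDYTW"

if __name__ == "__main__":
    print(CLONED)
    print(match_line(CLONED, SELECTED))
    print(SELECTED)
    print(f"{percent_identity(CLONED, SELECTED):.1f}% identity")
```

For this segment, eight of the eleven aligned positions are identical.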



Similarity was found between the cloned protein and both the Vargula luciferase and the
Renilla LBP, but no other bioluminescent protein.

Sequence homology between the cloned protein and (a) Vargula luciferase (b) Renilla
LBP. An area of high homology in all three proteins is in bold print. In each alignment the
upper line is the cloned protein, the middle line is the homology and the lower line is the
selected protein, with the residue numbers of the first and last residues at either end.

(a)
148 GTIVVTVRVSLYDEDNNGVMDEGKVIPSETIEDDIKDCGLLD-QDVELDYTWTQNECDL 206
    V+VSL D+ + + T+ DID + V++ + +
353 YWNTWDVKVSLRDVESYTEVEKVTIRKQSTVVDLIVDGKQVKVGGVDVSIPYSSENTSI 412

(b)
105 STMPGTYMLMDVCATRDADDKCIEGTIVVTVRVSLYDEDNNGVMDEGKVIPSETIEDDIKDC 166
    + TR + VR+S+ + N+ K I
 41 AIKIAKLSAEKAEETRGFLRVADQLGLAPGVRISVEEAAVNATDSLLKMKAEEKAMAVIQSL 104

Three potential N-linked glycosylation sites on the protein have the consensus triplet sequence
Asn-Xaa-Ser/Thr (where Xaa can be any residue except proline). Thr 216 was identified as a
potential site of O-linked glycosylation by a neural network which has been trained to identify
this type of glycosylation. The amino acid sequence was also entered into a neural network
which had been trained to identify eukaryotic signal peptides. This confirmed that the most
likely cleavage site is
between positions 20 and 21 (GSG-EE).
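
A scan for the Asn-Xaa-Ser/Thr consensus, and removal of a signal peptide cleaved between positions 20 and 21, can be sketched in Python as follows. This is a minimal sketch and not the method used to analyse the clones; the function names and the two toy sequences are hypothetical and are not the apopholasin sequence of Figure 4.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr. The lookahead allows
# overlapping candidate sites to be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def n_glycosylation_sites(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues in Asn-Xaa-Ser/Thr triplets."""
    return [m.start() + 1 for m in SEQUON.finditer(protein)]


def remove_signal_peptide(protein: str, cleavage_after: int = 20) -> str:
    """Return the sequence remaining after cleavage following a given residue."""
    return protein[cleavage_after:]


# Hypothetical toy sequences (assumptions for illustration only).
TOY_PROTEIN = "AANGTPNPSLNKSA"
TOY_PRECURSOR = "A" * 17 + "GSG" + "EEK"

if __name__ == "__main__":
    print(n_glycosylation_sites(TOY_PROTEIN))
    print(remove_signal_peptide(TOY_PRECURSOR))
```

In the toy sequence, the triplet Asn-Pro-Ser is correctly skipped because proline occupies the Xaa position.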

Many families of proteins contain a "signature" sequence of amino acids. The sequence of the
clones did not contain any of these signatures present in the PROSITE database. The amino
acids from 170 to 185 correspond to the calcium binding consensus sequence
[DENQST]X[DENQST]X[DENQST]X[DENQST]X[DENQST]XX[DENQST]. Thirteen
potential phosphorylation sites were discovered that matched the consensus sequences for
either the kinase phosphorylation site [RK](2)-x-[ST], the protein kinase C phosphorylation site
[ST]-x-[RK] or the casein kinase II phosphorylation site [ST]-x(2)-[DE].
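
Consensus patterns written in this style can be converted into regular expressions and searched for mechanically. The following Python sketch handles only the simple syntax used above (residue classes in square brackets, x for any residue and a repeat count in parentheses); the function names are illustrative assumptions. The test fragment is one of the cloned protein segments quoted in Table 1, so reported positions are relative to that fragment and do not restate the thirteen sites mentioned above.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern into a Python regular expression."""
    regex = pattern.replace("-", "")               # separators carry no meaning
    regex = regex.replace("x", ".")                # x matches any residue
    regex = re.sub(r"\((\d+)\)", r"{\1}", regex)   # (n) becomes {n}
    return regex


def find_sites(protein: str, pattern: str) -> list[int]:
    """Return 1-based start positions of all (overlapping) pattern matches."""
    lookahead = re.compile(f"(?=({prosite_to_regex(pattern)}))")
    return [m.start() + 1 for m in lookahead.finditer(protein)]


PATTERNS = {
    "kinase site": "[RK](2)-x-[ST]",
    "protein kinase C site": "[ST]-x-[RK]",
    "casein kinase II site": "[ST]-x(2)-[DE]",
}

# Fragment of the cloned protein quoted in Table 1 (not the full sequence).
FRAGMENT = "VMDEGKVIPSETIEDDIKDCGLLDQDVELDY"

if __name__ == "__main__":
    for name, pattern in PATTERNS.items():
        print(name, prosite_to_regex(pattern), find_sites(FRAGMENT, pattern))
```

The lookahead form is used so that overlapping candidate sites are not missed.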

Three N-linked glycosylation sites were identified in the translated sequence of the clones. A
neural network which has been trained to identify O-linked glycosylation identified Thr 216
as a potential site of this type of glycosylation. At least one of these sites must be glycosylated in
the native protein in order to account for the presence of the sugar residues. A putative signal
peptide region preceded the N terminus of the secreted protein (the N terminus was determined
by amino acid sequencing, and the region was identified as a signal peptide by a neural
network). To confirm this result
the protein sequence was searched with PSORT for motifs which would locate the cloned
protein in a cellular compartment. The protein sequence did not contain any transmembrane
regions or N-myristoylation patterns which would indicate the presence of a lipid anchor. No
targeting or retention sequences were found for the nucleus, mitochondria, endoplasmic
reticulum or peroxisome.

The fact that the clones had some sequence similarity with proteins that interact with
nucleotides may suggest that pholasin binds a co-factor as part of the chemiluminescent
reaction. Beetle luciferases require ATP binding for chemiluminescent activity. There is no
P-loop binding motif ((A,G)x4GK(S,T) or (A)x{4}GK(T)) in the amino acid sequence of these
clones. However, not all ATP binding proteins contain this motif. Neither does the cloned
protein contain the GXGXXG phosphate binding consensus sequence necessary for the binding
of other co-factors such as nicotinamide adenine dinucleotide.
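
The two motif checks described above can be expressed as regular expressions. The following Python sketch is an illustration under stated assumptions: the function names and the two positive control strings are hypothetical, and the negative example is only a fragment of the cloned protein quoted in Table 1, not the full sequence of Figure 4.

```python
import re

# (A,G)x4GK(S,T): Ala or Gly, four residues of any kind, Gly-Lys, then Ser or Thr.
P_LOOP = re.compile(r"[AG].{4}GK[ST]")

# GXGXXG phosphate binding consensus.
GXGXXG = re.compile(r"G.G..G")


def has_p_loop(protein: str) -> bool:
    """True if the sequence contains the P-loop consensus."""
    return P_LOOP.search(protein) is not None


def has_gxgxxg(protein: str) -> bool:
    """True if the sequence contains the GXGXXG consensus."""
    return GXGXXG.search(protein) is not None


# Fragment of the cloned protein quoted in Table 1. It contains Gly-Lys, but
# the following residue is Val, so the P-loop consensus is not satisfied.
FRAGMENT = "SLYDEDNNGVMDEGKVIPSETIE"

# Hypothetical positive controls (assumptions for illustration only).
P_LOOP_CONTROL = "GAGGVGKS"
GXGXXG_CONTROL = "GAGAAG"

if __name__ == "__main__":
    print(has_p_loop(FRAGMENT), has_gxgxxg(FRAGMENT))
    print(has_p_loop(P_LOOP_CONTROL), has_gxgxxg(GXGXXG_CONTROL))
```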

The amino acid and sugar components of pholasin are not able to emit light at the wavelength
of the native protein (490nm). This indicates that there must be a chromophore bound to the
protein. There are, however, proteins in which the chromophore is composed of modified
amino acid residues within the polypeptide. The best characterised of these is the green
fluorescent protein (GFP). This has a chromophore which is a ring formed by the autocatalytic
cyclisation of the residues Ser-dehydroTyr-Gly. The serine may be mutated to a threonine,
which increases the amplitude of the emission at 488nm. Pholasin had no similar amino acid
sequence. Putative luciferin binding regions have been identified for two bioluminescent
chemistries. Aequorin has a putative coelenterazine binding region, which is also present in two
sections of the Vargula hilgendorfii luciferase. The sequence of the cloned protein has no
homology to the putative luciferin binding site of aequorin, but the region of the Vargula
luciferase from residue 353 to 411 has some similarity, as does the LBP of Renilla reniformis,
which also binds an imidazolopyrazine. This may indicate that the chemistry of pholasin
bioluminescence involves an imidazolopyrazine luciferin. However, the region of homology is
very small. The beetle luciferases contain an area of low sequence homology which may bind
the benzothiazole luciferin. This low homology may account for the different colours of beetle
bioluminescence. A luciferin analogue, 2-(4-benzoylphenyl)thiazole-4-carboxylic acid, has been
used to photoinactivate the luciferase active site of the firefly Photinus pyralis. This
photoinactivation was directly linked to the degradation of a small peptide sequence HHGF
(residues 244-257). This is therefore postulated as a luciferin-binding site. The cloned protein
does not have any sequence homology with these putative binding regions. Two strongly
conserved regions of amino acids have also been found in both the luciferase and the luciferin
binding protein of the dinoflagellate Gonyaulax polyedra. These regions were compared to the
cloned protein, but no sequence similarity was found. No sequence identity could be established
between the bacterial luciferases and the cloned protein.

Therefore, the present invention provides cloned apophotoprotein apopholasin (and the
cDNA coding therefor), which has identical properties to native (but non-glycosylated)
apopholasin with respect to molecular weight, amino acid composition, potential for
glycosylation, its highly acidic pI and its cellular location. Hence, the present invention can
further provide the corresponding BOIP or modified BOIP, according to standard
methods.

The corresponding BOIP is preparable by bringing the apophotoprotein apopholasin into
association with the luciferin, also using standard methods. Although the luciferin is
tightly bound in the native pholasin BOIP, it has been found that this may not be the case in
the recombinant pholasin BOIP; indeed the luciferin may be weakly bound or merely
present with the apoprotein. For example, a methanol, aqueous, acidic or other extract of
Pholas dactylus (whole organism or light organ dissected from the animal) containing the
'luciferin', or the pure luciferin, is added to the solution, cell or organism. The luciferin
associates with the apo-BOIP, forming the photoprotein. The luciferin on the photoprotein
then reacts with oxygen or one of its metabolites to produce light, in the presence or
absence of the luciferase. The light emission may be detected, quantified, or imaged using
a luminometer, photographic film or imaging camera, or by the naked eye. Alternatively,
light emission may be generated spontaneously by intra- or extra-cellular metabolites
reacting with the apo-BOIP.

Although illustrated with respect to pholasin, the following may apply to any BOIP: the
BOIP can be produced directly from native DNA, or from DNA engineered or amplified
by the polymerase chain reaction. By this means, sites can be inserted within the protein
by splitting the DNA into two or more pieces, or by adding DNA sequences to the 5' or 3'
ends. For example, the DNA may be expressed in bacteria, yeast, an insect or human cell,
or other suitable organism to produce protein which can be extracted and used.

In this instance, the protein produced from the cloned DNA reacts with oxygen or a
metabolite of oxygen, such as the superoxide anion (O₂⁻), hydrogen peroxide (H₂O₂),
hydroxyl radical (OH·), an oxyhalide anion (OCl⁻, OBr⁻, OI⁻, OSCN⁻), nitric oxide (NO), an
organic hydroperoxide or a radical ROO·. The change in light emission enables the oxygen
or metabolite(s) to be detected and quantified in live cells, organelles, or on the outer or
inner surface of the plasma membrane, or within an organ of a live organism without the
need to break them open or the need to separate bound and free fractions. This also
enables an enzyme producing oxygen or one of its metabolites, such as chlorophyll, or
enzymes such as oxidases and oxygenases, which react directly with oxygen or one of its
metabolites to attach oxygen to the substrate, to be detected and quantified in live cells,
organs and whole organisms, or extracts from any one of these.

The BOIP can also be made in vitro by transcription/translation in a cell lysate such as
rabbit reticulocyte lysate or wheat germ extract containing RNA polymerase. The DNA
for the BOIP is first engineered to contain an RNA polymerase promoter, such as T7 or SP6;
bacterial promoter(s), such as lac, ara or trp; or mammalian promoter(s), such as actin,
myosin, myelin proteins, TK, MRT-V, SV40, CMV, RSV, metallothionein, antibody or G6P
dehydrogenase, and can be amplified in vitro using the polymerase chain reaction. A
poly-A tail may be added at the 3' end and a tissue specific promoter or enhancer sequence
added to the 5' or 3' end of the DNA coding for the BOIP or modified BOIP, enabling it
to be expressed specifically in a target cell, such as a myocardial cell or a cancer cell. The
expression of the BOIP in the target cell is detected and quantified by light intensity,
colour or polarisation, as previously mentioned.

The BOIP, or its DNA or RNA, may be incorporated into a live bacterium or eukaryotic cell
using phage, virus, plasmid, calcium phosphate transfection, electroporation, liposome
fusion, membrane pore forming proteins, micro-injection or DNA gun. Once inside cells
or an appropriate extracellular environment, cell activation or injury will initiate or change
the light emission from the BOIP. For example, expression in live organisms may be
achieved by micro-injection of protein, RNA or DNA, or by transgenic manipulation, to
produce a cell, such as a bacterial, microbial, animal or plant cell, eg a white blood cell, a
heart cell or a yeast, or an organism, such as a protozoan, fruit fly (Drosophila), nematode
worm, polychaete worm, fish, human, mouse, rat, sheep, pig, horse or plant, which can
generate its own light.

The BOIP can also be incorporated into a defined part of a live cell by chemical means or
by genetically engineering the BOIP to contain a signal peptide, which locates the BOIP to
the inner or outer surface of the plasma membrane or within a particular organelle, such as
peroxisome, mitochondrion, chloroplast, tonoplast, endoplasmic reticulum, Golgi
apparatus, endosome, lysosome, secretory vesicle, nucleus, nucleolus, nuclear membrane,
plasma membrane, proteosome, or gap junction, or structure such as membrane receptor,
ion channel, microtubule, cytoskeleton, nuclear skeleton, nuclear receptor, mitotic spindle
or microfilaments. The signal peptide, added either chemically or genetically, will normally
target the normal or modified BOIP to a particular intra- or extra-cellular site. For
example, the sequence MLSRLSLRLLSRYLL or part of cytochrome oxidase on the
N-terminus will target the BOIP to the mitochondrion; KKSALLALMYVCPGKADKE or
MLLPVPLLLGLLGLAA at the N-terminus will target the BOIP to the endoplasmic
reticulum, a KDEL or HDEL or KEEL sequence at the C-terminus retaining it there. SKL
at the C-terminus targets the BOIP to the peroxisome; PKKKRKV or an extension of this SV40
large T-antigen signal will target it to the nucleus; and a palmitoylation and/or a
myristoylation signal will target it to the plasma membrane. By coupling the BOIP to
another protein that targets itself to a particular site, the BOIP can also be targeted there.
For example, coupling the nuclear proteins nucleoplasmin or lamin B receptor to the BOIP
targets it to the nucleus; cytochrome oxidase at the N-terminus targets the BOIP to the
mitochondria; chlorophyll at the N-terminus targets the BOIP to the chloroplast; a connexin
at the N-terminus targets the BOIP to the gap junction or plasma membrane; and SNAP 25 to
the plasma membrane.
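
The attachment of targeting sequences to the N- or C-terminus can be sketched at the level of amino acid sequence in Python as follows. The signal sequences are those quoted above for which a terminus is stated; the function name, the dictionary layout and the placeholder BOIP sequence are hypothetical, and the placeholder is not the apopholasin sequence of Figure 4.

```python
# Signals quoted in the text for which the terminus is stated.
N_TERMINAL_SIGNALS = {
    "mitochondrion": "MLSRLSLRLLSRYLL",
    "endoplasmic reticulum": "MLLPVPLLLGLLGLAA",
}
C_TERMINAL_SIGNALS = {
    "endoplasmic reticulum retention": "KDEL",
    "peroxisome": "SKL",
}


def add_targeting(boip: str, n_signal: str = "", c_signal: str = "") -> str:
    """Return the BOIP sequence with optional N- and C-terminal signals added."""
    return n_signal + boip + c_signal


# Hypothetical placeholder standing in for a BOIP sequence (assumption).
PLACEHOLDER_BOIP = "EEGG"

if __name__ == "__main__":
    er_retained = add_targeting(
        PLACEHOLDER_BOIP,
        n_signal=N_TERMINAL_SIGNALS["endoplasmic reticulum"],
        c_signal=C_TERMINAL_SIGNALS["endoplasmic reticulum retention"],
    )
    peroxisomal = add_targeting(
        PLACEHOLDER_BOIP, c_signal=C_TERMINAL_SIGNALS["peroxisome"]
    )
    print(er_retained)
    print(peroxisomal)
```

In practice the fusion would be made at the DNA level, by adding the corresponding coding sequence to the 5' or 3' end of the DNA coding for the BOIP.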

Other modifications to the apoprotein, BOIP, or nucleotides coding therefor include, but
are not limited to:

The apoprotein, such as apopholasin, may also be glycosylated, and used to detect and
quantify secretion or movement of proteins through the secretory pathway.

Nucleic acid coding for the BOIP when expressed inside a live cell may not only be
modified but also regulated in this cell by gene expression, such as by promoters,
enhancers or oncogenes. For example, the apoprotein, such as apopholasin, may be
coupled to a gene regulator protein, such as a transcription factor, by genetic or chemical
manipulation, such that the movement of the regulator protein through a cell, or its
activity, can be detected or quantified.

The BOIP or apoprotein, or its DNA, may be linked to another protein or DNA used in
therapy, such that the other protein or DNA can be detected in live cells or in a whole
organism, eg a human.

The apoprotein, such as apopholasin, can also be engineered genetically or chemically to
contain a site or sites which can be covalently modified by enzymatic reactions such as
phosphorylation (including ser/thr, his and tyr kinases and phosphatases),
transglutamination, proteolysis, ADP ribosylation, glycosylation or glucosylation, halogenation,
oxidation, methylation, palmitoylation, myristoylation and farnesylation.

The apoprotein, such as apopholasin, can be engineered genetically or chemically to
contain an antigen or intracellular signal binding site, such as for Ca²⁺, cyclic AMP, cyclic
GMP, cyclic CMP, IP₃, IP₄, diacyl glycerol, ATP, ADP, AMP, GTP, or any oxy- or
deoxy-ribonucleoside or nucleotide, a substrate, a drug, a nucleic acid and/or a gene regulator
protein.

The BOIP may also be converted to a rainbow protein by engineering a particular site such
as described hereinabove into the BOIP, at the N- or C-terminus, or between the BOIP and
an energy transfer acceptor, such as GFP (wild type or any of the mutant GFPs), in a
chimera of the two. Such energy transfer is known as chemiluminescence, bioluminescence
or fluorescence resonance energy transfer (CRET, BRET or FRET, respectively). Conversion
of the BOIP to a 'rainbow
protein' may be effected by reaction with a cellular substance, modification genetically or
chemically, or by linking the BOIP to a fluor, such as the green fluorescent protein or the
red fluorescent protein in the deep sea fish Malacosteus. The result is a BOIP which
changes its colour and/or intensity and/or polarisation of emission. The change in colour
occurs by energy transfer, eg resonance transfer (CRET or FRET) or electron transfer.

The initial (unmodified) BOIP may be the apophotoprotein, its DNA or RNA, from the
bivalve mollusc Pholas dactylus, or it may be another BOIP, such as one from the mollusc
Rocellaria grandis or the squid Ommastrephes, or earthworm luciferase, which produce
light with oxygen metabolites in a way very similar to Pholas dactylus.



The BOIP, apo-BOIP, or nucleic acid coding for it, whether modified or not, may
therefore be used in a range of biological investigations such as:

(a) Detection, location and measurement of signals in substrates, such as live cells,
organs or organisms, or in extracellular fluids;

(b) Detection, location and measurement of oxygen and its metabolites in substrates,
such as live cells, organs or organisms, or in extracellular fluids, water (sea and
fresh), soil or the atmosphere;

(c) Detection and location of normal cells such as microbes (protozoa, yeast, fungi,
moulds, bacteria, viruses);

(d) Detection and location of abnormal cells, such as cancer cells, hyperactive cells in
rheumatoid arthritis and other inflammatory diseases, cells infected with a
pathogen, such as a virus or other infectious agents, cells damaged by physical,
chemical or biological attack, cells damaged by perfusion or reperfusion injury or
cells damaged by oxygen or one of its metabolites;

(e) Measurement and location of enzymes, particularly those producing oxygen or its
metabolites, and other tumour reactions in cells or biological fluids;

(f) DNA and RNA binding assays;

(g) Immunoassay and other protein binding assays;

(h) In genetic engineering, in the development of transgenic animals and plants, and
microbes; in horticulture; agriculture; medicine and veterinary medicine; and/or

(i) In genetic entertainment by incorporation into light sticks, greeting cards or toys to
produce light of various colours, intensities, oscillations, flashes and glows; or in
comestibles, such as food and drinks, including beers, wines, spirits, colas and other
soft drinks.

Accordingly, the present invention further provides an apoprotein, such as pholasin
apoprotein (or apopholasin) in both unglycosylated and glycosylated forms, and a BOIP
thereof, such as pholasin, either alone (but excluding native proteins per se that have
already been isolated, such as native pholasin per se) or in association with one or more of:
a targeting or signal peptide; a glycosylate; a site capable of modification by an enzyme;
an antigen or intracellular signal binding site; a promoter, an enhancer or an oncogene or
a pharmacologically active substance; or the like. The present invention further provides a
recombinant construct comprising a nucleic acid sequence encoding for any of these
proteins; a vector containing a nucleic acid sequence encoding for any of these proteins; a
host transformed by such vector; a live cell, such as bacterial, insect, eukaryotic,
prokaryotic, archaeal or plant cells containing or expressing any of these proteins; and a
rainbow protein, as described herein, together with a nucleic acid sequence encoding
therefor.

The present invention will now be illustrated with reference to the following non-limiting 
examples, in which the methodology referred to is known to those skilled in the art and/or 
may be carried out by analogy with reference to the protocols disclosed in the following 
25 references, the contents of which are herein incorporated by reference in their entirety: 
BOOKS 

1. Campbell, AK. (1988). Chemiluminescence: principles and applications in biology and 
medicine, pp608. Horwood/VCH, Chichester and Weinheira 

2. Campbell,AK. (1994). Rubicon: the fifth dimension of biology, pp 304. Duckworth, 
30 London. 

PAPERS 

1. Campbell,AK Patel,A. (1983) Biochem J. 216:185-194. A homogeneous immunoassay 
for cyclic nucleotides based on chemiluminescence energy transfer. 


2. Roberts, PA, Knight, J and Campbell, AK. (1987) Anal. Biochem. 160:139-148. Pholasin - a
bioluminescent indicator for detecting activation of single neutrophils.

3. Mueller, T, Davies, EV and Campbell, AK. (1989) J. Biolum. Chemilum. 3:105-113. Pholasin
Chemiluminescence Detects Mostly Superoxide Anion Released from Activated Human
Neutrophils.

4. Mueller, T and Campbell, AK. (1989) J. Biolum. Chemilum. 5:25-30. The Chromophore of
Pholasin: A Highly Luminescent Protein.

5. Sala-Newby, G and Campbell, AK. (1991) Biochem. J. 279:727-732. Engineering a
bioluminescent indicator for cyclic AMP dependent protein kinase.

6. Campbell, AK, Trewavas, AJ and Knight, MR (1996). Calcium imaging shows
differential sensitivity to cooling and communication in luminous transgenic plants. Cell
Calcium 19:211-218.

7. Kendall, JM, Sala-Newby, G, Badminton, M, Campbell, AK and Rembold, CR (1996).
Free Ca2+ in the endoplasmic reticulum of living cells measured using aequorin as a
pseudo luciferase. Biochem. J. 318:383-387.

8. Badminton, MN, Campbell, AK and Rembold, CR (1996). J. Biol. Chem. 271:31210-31214.
Differential regulation of nuclear and cytosolic Ca2+ in HeLa cells.

9. Sala-Newby, GB, Taylor, KT, Badminton, MN, Rembold, CR and Campbell, AK
(1998). Imaging bioluminescent indicators shows Ca2+ and ATP permeability thresholds
in live cells attacked by complement. Immunology 93(4):601-609.

10. Sala-Newby, GB, Kendall, JM, Jones, H, Taylor, KM, Badminton, MN, Llewellyn, DH
and Campbell, AK (1999). Targeting bioluminescent proteins to defined compartments of
living cells. Methods in Enzymology: Bioluminescence and Chemiluminescence. ed Ziegler,
M and Baldwin, T.

EXAMPLE 1: Production of a BOIP in bacteria

cDNA or genomic DNA coding for apopholasin, with or without the cDNA coding for the signal
peptide, is amplified by PCR with restriction sites such as BamHI at each end. The amplified
DNA is run on an agarose gel and the full-length DNA eluted and purified. The DNA is then cut
with BamHI to generate sticky ends and ligated into an expression plasmid such as pET3a, which
has also been cut with BamHI. After ligation the sealed plasmid is transformed into a standard
E. coli K12 strain such as JM109, and a colony is picked off for a large plasmid preparation.
After checking that the plasmid contains the correct sequence for apopholasin, in the correct
orientation, the plasmid is used to transform a standard expression strain of E. coli such as
BL21(DE3). A colony is picked off the agar plate and grown up for 2 h in standard LB broth.
IPTG is added as inducer for a further 2 h. Apopholasin can then be extracted by breaking the
bacteria by lysozyme digestion or sonication in a standard salt medium such as 50 mM HEPES
pH 7 +/- 1 mM ascorbate. Since the apopholasin is unglycosylated it tends to aggregate and
form inclusion bodies. These can be broken up using 8 M urea or guanidinium chloride, which is
then dialysed off. If the pH of PAGE gels is alkaline this also tends to allow aggregation of
both the unglycosylated and glycosylated apo- and full pholasin. A signal peptide such as the
β-lactamase signal will target the BOIP to the periplasmic space, allowing the expressed
protein to be recovered from the external fluid of the cells.
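
By way of illustration only, the orientation check mentioned above may be sketched in Python. Because the insert carries a BamHI site at each end it can ligate in either orientation, so the sequenced plasmid is searched for the insert and for its reverse complement. The function names and the toy sequences are hypothetical and are not taken from the disclosed sequences; the plasmid is treated as a linear string, so an insert spanning the start of the sequence file would not be found.

```python
BAMHI_SITE = "GGATCC"  # recognition sequence of BamHI

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def insert_orientation(plasmid: str, insert: str) -> str:
    """Report whether the insert is present forward, reversed, or absent."""
    plasmid, insert = plasmid.upper(), insert.upper()
    if insert in plasmid:
        return "forward"
    if reverse_complement(insert) in plasmid:
        return "reverse"
    return "absent"


def count_bamhi_sites(seq: str) -> int:
    """Count BamHI sites in a sequence.

    Assumption: a vector with one BamHI site and an insert with no
    internal site gives two sites after a single insertion.
    """
    return seq.upper().count(BAMHI_SITE)


# Hypothetical toy sequences, for illustration only.
toy_insert = "ATGGCTTGTATCG"
toy_plasmid = "TTTT" + BAMHI_SITE + toy_insert + BAMHI_SITE + "AAAA"
print(insert_orientation(toy_plasmid, toy_insert), count_bamhi_sites(toy_plasmid))
```

A clone reported as "reverse" would be discarded before transforming the expression strain.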


EXAMPLE 2: Production of a BOIP in insect cells 

cDNA or genomic DNA coding for apopholasin is inserted into a plasmid suitable for
conversion into baculovirus when transfected into insect cells. Since pholasin is secreted
by Pholas itself there is a signal peptide at the N-terminus. Removal of this by PCR will
allow cytosolic expression in insect cells. If the signal peptide is left on or changed for
the honey bee melittin signal peptide, the apopholasin is secreted into the external medium.
The virus containing the DNA for apopholasin is then purified and stored until required.
An aliquot is then added to fresh insect cells and these are incubated for 3-7 days. The
apopholasin is then isolated from the supernatant if a signal peptide is used, or from the
cells if not. The apopholasin can then be purified by ammonium sulphate precipitation, gel
filtration and DEAE chromatography. The state of glycosylation can be assessed by
running the protein on PAGE, where the molecular weight is 34 kDa. Removal of the
glycosylation by enzymes returns the protein to the size of apopholasin, 23.5 kDa. It can be
stored frozen or freeze dried, and activated to form pholasin by addition of luciferin as
described in Example 3.
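
As a simple worked illustration of the PAGE assessment above, the apparent mass contributed by glycosylation follows from the two band sizes quoted (34 kDa and 23.5 kDa). The sketch below is an assumption-laden aid only: the function names are hypothetical, and apparent masses on PAGE are estimates rather than exact molecular weights.

```python
def glycan_mass_kda(glycosylated_kda: float, deglycosylated_kda: float) -> float:
    """Apparent mass removed by deglycosylation, in kDa."""
    if glycosylated_kda < deglycosylated_kda:
        raise ValueError("glycosylated form should not be smaller than the apoprotein")
    return glycosylated_kda - deglycosylated_kda


def glycan_fraction(glycosylated_kda: float, deglycosylated_kda: float) -> float:
    """Fraction of the apparent glycoprotein mass attributable to glycosylation."""
    return glycan_mass_kda(glycosylated_kda, deglycosylated_kda) / glycosylated_kda


# Band sizes quoted in Example 2.
print(glycan_mass_kda(34.0, 23.5))            # 10.5
print(round(glycan_fraction(34.0, 23.5), 3))  # 0.309
```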

Since the apopholasin tends to aggregate in the insect supernatant it is important to get the
protein into non-aggregating buffer, e.g. 50 mM HEPES pH 6, 1-10 mM ascorbate, as soon
as possible.

Formation of pholasin can then be achieved as described in Example 3. 


EXAMPLE 3: Generating pholasin and light emission 

To generate light the apopholasin must first be converted into pholasin with the luciferin.
The luciferin can be extracted from native pholasin by mild acid, or by methanol, mild acid
or alkaline treatment of light organs isolated from Pholas dactylus or the whole organism.
After homogenisation the extract is centrifuged or filtered to remove particulate material.
Further purification can be achieved by TLC or HPLC. The luciferin is best stored dry, but can
be stored at -70°C. The intactness and concentration can be estimated by measuring the
absorbance or fluorescence. The details are as follows:


(a) Isolation of the luciferin 

Four protocols (1-4) have been developed to extract and isolate the luciferin
responsible for light emission in pholasin. The luciferin is a small organic moiety
tightly bound to apopholasin when pholasin is isolated from Pholas dactylus, but
can also be found not bound to apopholasin. Thus the extraction procedure isolates
either form of luciferin.

1. The organism Pholas dactylus or its light organs are homogenised in 50 mM
sodium phosphate pH 6.0 on ice. The pholasin is precipitated with saturated
ammonium sulphate (4°C, stirred), and then removed by centrifugation at ca
15,000 g for 30 min in the cold. The supernatant is then passed down a SEP-PAK
silica column, which binds the luciferin. The column is washed with
5 ml ethyl acetate and then 5 ml of methanol. The active fractions containing
the luciferin are assayed either by reactivation of the apopholasin or by
chemiluminescence in DMSO, DMF, or NaOCl. The luciferin is
concentrated and can be purified further on TLC or HPLC with a standard
solvent. It is dried and stored at -70°C.

2. The organism Pholas dactylus or its light organs are homogenised in cold
acetone on ice, filtered through a Buchner funnel, and extracted with
methanol:acetone (1:1), the residual powder being extracted 3 times with
methanol and the extracts combined. These are then concentrated in a
Rotavaporator and left to stand for 1 h on ice to allow further precipitation.
The suspension is then refiltered and concentrated. The solution containing
the luciferin is then passed down a SEP-PAK silica column which binds the
luciferin. The column is washed with 5 ml ethyl acetate and then 5 ml of
methanol. The active fractions containing the luciferin are assayed either by
reactivation of the apopholasin or by chemiluminescence in DMSO, DMF,
or NaOCl. The luciferin is concentrated and can be purified further on TLC or
HPLC with a standard solvent. It can be dried and stored at -70°C.

3. The organism Pholas dactylus or its light organs are homogenised in cold
acetone on ice, and filtered through a Buchner funnel to give an acetone
powder. This is then extracted with methanol:acetone (1:1), twice for
10 min and then 3 times with methanol. The extracts are combined and
concentrated in a Rotavaporator. They are left to stand for 1 h on ice to
allow further precipitation, refiltered and concentrated. The residual
powder is resuspended in 50 mM sodium phosphate pH 6.0, 10 mM
ascorbate, and ultrafiltered with a 10 kD Amicon membrane at 4°C for
pholasin. The solution containing the luciferin is then passed down a SEP-PAK
silica column which binds the luciferin. The column is washed with
5 ml ethyl acetate and then 5 ml of methanol. The active fractions containing
the luciferin are assayed either by reactivation of the apopholasin or by
chemiluminescence in DMSO, DMF, or NaOCl. The luciferin is
concentrated and can be purified further on TLC or HPLC with a standard
solvent. It is dried and stored at -70°C.

4. The organism Pholas dactylus or its light organs are homogenised in 50 mM
HEPES buffer, with methanol and 100 mM HCl on ice, and incubated for 2 h
on ice. After centrifugation at ca 15,000 g for 30 min in the cold, the
supernatant is then passed down a SEP-PAK silica column which binds the
luciferin. The column is washed with 5 ml ethyl acetate and then 5 ml of
methanol. The active fractions containing the luciferin are assayed either by
reactivation of the apopholasin or by chemiluminescence in DMSO, DMF,
or NaOCl. The luciferin is concentrated and can be purified further on TLC
with a standard solvent. It is dried and stored at -70°C.

Method 4 normally generates most luciferin. The luciferin is characterised by its
absorbance and fluorescence spectrum, and by its chemiluminescence with DMSO,
NaOCl and apopholasin.

(b) Generation of pholasin from apopholasin and the luciferin

A small sample of the luciferin (1-10 µl) is added to apopholasin in an appropriate
buffer (50 mM HEPES pH 6-7.5, +/- 1-10 mM ascorbate, or 500 mM NaCl, 10 mM
TES, 1 mM EDTA, 1 mM mercaptoethanol pH 6-7.5). The mixture is incubated at
room temperature for up to 24 h, and the pholasin assayed by adding an oxygen
metabolite, e.g. NaOCl, or luciferase to a sample. When apopholasin has been
expressed in cells, the luciferin is added externally, microinjected into individual
cells or added via liposomes to get the luciferin into the cell.

Light is detected and quantified in a standard luminometer, imaging camera
(intensified or CCD), or by a silicon chip.

EXAMPLE 4: Production of a BOIP in vitro 

cDNA or genomic DNA coding for apopholasin, with or without the signal peptide, is amplified
by PCR with the 5' primer containing the T7 RNA polymerase promoter sequence. The
DNA product is purified and precipitated. After dissolving in 10 mM Tris/1 mM EDTA pH 7,
the DNA is added to a standard in vitro transcription/translation system such as rabbit
reticulocyte lysate or wheat germ extract and incubated at 30°C for 30-60 min. The
apopholasin can then be purified and activated to form pholasin as described in Example 3.

EXAMPLE 5: Targeting a BOIP in vitro

The BOIP can also be incorporated into a defined part of a live cell by chemical means or
by genetically engineering the BOIP to contain a signal peptide which locates the BOIP to
the inner or outer surface of the plasma membrane or within a particular organelle such as
peroxisome, mitochondrion, chloroplast, tonoplast, endoplasmic reticulum, Golgi,
endosome, lysosome, secretory vesicle, nucleus, nucleolus, proteasome, or gap junction,
or structure such as microtubule, cytoskeleton, nuclear skeleton, nuclear receptor, or
mitotic spindle. The signal peptide, added either chemically or genetically, will normally
target the normal or altered BOIP to a particular intra- or extra-cellular site. For example,
the sequence MLSRLSLRLLSRYLL or part of cytochrome oxidase on the N-terminus
will target the BOIP to the mitochondrion; KKSALLALMYVCPGKADKE or
MLLPVPLLLGLLGLAA or the ER protein calreticulin at the N-terminus will target the
BOIP to the endoplasmic reticulum, a KDEL or HDEL sequence at the C-terminus
retaining it there. SKL at the C-terminus targets the BOIP to the peroxisome; PKKKRKV or an
extension of this SV40 large T-antigen signal will target it to the nucleus; and a
palmitoylation and/or a myristoylation signal (MGCVCSSNPD, the LCK N-terminal
acylation motif from tyrosine kinase) will target it to the plasma membrane. By coupling
the BOIP to another protein which targets itself to a particular site, the BOIP is also
targeted there. For example, coupling the nuclear proteins nucleoplasmin or lamin B
receptor to the BOIP targets it to the nucleus; cytochrome oxidase at the N-terminus targets
the BOIP to the mitochondria; chlorophyll at the N-terminus targets the BOIP to the
chloroplast; a connexin at the N-terminus targets the BOIP to the gap junction or plasma
membrane; and SNAP 25 targets it to the plasma membrane.
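
The targeting scheme described above can be sketched, purely for illustration, as a lookup of N- and C-terminal additions applied to a protein sequence. The peptide sequences are those quoted in this Example; the table layout, the function name and the placeholder protein are hypothetical, and placing PKKKRKV at the C-terminus is an assumption, since the position of the nuclear signal is not stated here.

```python
# Targeting sequences quoted in Example 5 (single-letter amino acid code).
TARGETING = {
    "mitochondrion": {"n_term": "MLSRLSLRLLSRYLL", "c_term": ""},
    "endoplasmic reticulum": {"n_term": "MLLPVPLLLGLLGLAA", "c_term": "KDEL"},
    "peroxisome": {"n_term": "", "c_term": "SKL"},
    # Assumption: the text does not say where PKKKRKV is placed.
    "nucleus": {"n_term": "", "c_term": "PKKKRKV"},
    "plasma membrane": {"n_term": "MGCVCSSNPD", "c_term": ""},
}


def add_targeting(protein: str, site: str) -> str:
    """Return the protein sequence with the targeting peptides for `site` added."""
    if site not in TARGETING:
        raise KeyError(f"no targeting sequence listed for {site!r}")
    signal = TARGETING[site]
    return signal["n_term"] + protein + signal["c_term"]


# Hypothetical placeholder sequence, not the apopholasin sequence.
placeholder = "MACIV"
print(add_targeting(placeholder, "peroxisome"))             # MACIVSKL
print(add_targeting(placeholder, "endoplasmic reticulum"))  # MLLPVPLLLGLLGLAAMACIVKDEL
```

In practice the corresponding DNA would be added by PCR rather than by joining peptide strings; the sketch only shows which end of the BOIP each signal occupies.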

In order to target pholasin to defined sites in living cells, the DNA coding for these
targeting sequences is added by using PCR. For cytosolic apopholasin the native signal
peptide is removed, and the BOIP can also be linked to larger proteins at the N- or
C-terminus, such as firefly luciferase or aequorin, to prevent it getting into the nucleus. This
also enables ATP and oxygen metabolites, or Ca2+ and oxygen metabolites, to be measured
simultaneously in the same cells by intensity, colour or polarisation of the different
bioluminescent indicators. A multiple bioluminescent indicator can also be engineered by
PCR, or by using restriction enzyme sites, from the DNA coding for 3 or more
bioluminescent proteins. A simple screen of the transformed bacteria enables the multiple
rainbow protein to be isolated with 2-3 colours or more.

The DNA is then added to an in vitro transcription/translation system as described in
Example 4 in the presence of the organelle to be targeted (e.g. microsomes for the
endoplasmic reticulum, which glycosylate apopholasin).

The new DNA can also be inserted into a plasmid by standard techniques, and transformed
into bacteria or transfected or injected into eukaryotic cells such as HeLa or COS.

Addition of the luciferin as described in Example 3 allows formation of pholasin which can
then be detected by light emission. Changes in oxygen metabolite production can then be
detected by a luminometer or imaging camera when the cells are exposed to external
oxygen metabolites, a change in oxygen concentration, addition of stimuli e.g. TNF, EGF,
hormones or drugs, or attack by pathogens such as bacteria, viruses, complement,
antibodies, toxins, and cells of the immune system.



EXAMPLE 6: Engineering a covalent modification site into a BOIP

(a) DNA coding for the recognition site of protein kinase A (RRAS or kemptide), protein
kinase C (MARCKS), MAP kinase, ERK, the ER-nuclear signalling kinase IRE1p or a
phosphatase is added to the N- or C-terminus or inserted at various sites within the
apopholasin by PCR and expressed as described in Examples 1-5. Pholasin is then
generated by addition of the luciferin as described in Example 3.

Addition of the catalytic subunit for protein kinase A, or activation via cyclic AMP
inside cells, leads to phosphorylation or dephosphorylation of the modified pholasin
and a change in light emission (intensity, colour or polarisation).

A preliminary screen is necessary to select the appropriate proteins and to discard 
any which have lost all activity. 

(b) DNA coding for the cleavage site of a protease (thrombin, enterokinase, HIV protease,
caspase) is added to the N- or C-terminus of the apopholasin by PCR or inserted at various
sites within the protein, and expressed as described in Examples 1-5. Pholasin is
then generated by addition of the luciferin as described in Example 3.

EXAMPLE 7: Engineering a BOIP into a "Rainbow Protein"

cDNA coding for apopholasin is linked to another protein by using the cDNA coding for
that protein. For example, wild type GFP, the S65T mutant of the green fluorescent
protein, YGFP, or EGFP are linked to apopholasin by PCR at the N- or C-terminus, or by
splitting one or both proteins using multi-step PCR. In between there is a 'reactive'
peptide with a protease site (a thrombin or enterokinase) and a binding site for IP3, or the
15 amino acid sequence from IP3 kinase (an IP4 binding site). At the C-terminus of the
GFP, a peptide containing 6 lysine residues may also be added via PCR. The protein is
expressed and fluorescein covalently linked to these lysines by addition of fluorescein
isothiocyanate. Addition of the luciferin forms pholasin as described in Example 3. The
change in colour occurs by chemiluminescence resonance energy transfer. Without
fluorescein the rainbow protein emits blue-green light (508 nm), which changes to blue
(490 nm) when the reactive substance binds to the reactive peptide, or when either
thrombin or enterokinase is added. When the 6 amino acid linker is used the colour starts
as green (530 nm), and will then change from green, to blue-green and then blue as the
particular reactive sequences bind their respective analytes. Use of rhodamine instead of
fluorescein generates a rainbow protein which changes from red to green to blue.


A preliminary screen is necessary to select the appropriate rainbow proteins and to discard 
any which have lost all activity. 

The other protein linked to apopholasin may be, for example, any one of the following
linked chemically or genetically:

1. Firefly or any benzothiazole luciferase to the N or C terminus gives two colours for
ATP and oxygen metabolites.

2. Any imidazolopyrazine luciferase, including coelenterazine systems - decapod
shrimp, fish, squid, Renilla, anthozoan, chaetognath, radiolarian, or copepod - and
Vargula systems - ostracod, Porichthys and similar fish, cypridinids and Vargula.

3. Any tetrapyrrole luciferase such as dinoflagellate, euphausiid or stomiatoid fish.

4. Bacterial luciferase and other aldehyde or flavin luciferases, including polychaete
worm.

5. Any GFP, including wild type, S65T, enhanced GFP, blue GFP, yellow GFP,
Renilla GFP, Pilocarpus GFP, and Pennatula GFP, any anthozoan GFP, or any
coelenterate GFP.

6. The red fluorescent proteins from stomiatoid fish - Malacosteus, Aristostomias,
Photostomias.

7. The phycobiliproteins - phycoerythrin and phycocyanobilin.

8. The blue fluorescent lumazine protein in the bacterium Photobacterium.

9. The yellow flavin fluorescent protein in Y Vibrio.

10. Any lysine or arginine or other amino acid side chain where a fluor can be added
covalently, in which case the rainbow protein may emit more than two
colours. For example, rhodamine on a pholasin-linker-GFP chimera will turn from
red to green to blue.


A preliminary screen may be necessary to select chimeras which have not lost all 
bioluminescent activity. 

The 'reactive' peptide may be a binding site for any analyte, protein or DNA, metabolite,
substrate, vitamin, an enzyme such as protein kinase C or phosphatase, ion channel, ion
pump, antigen, antibody, nucleotide or nucleoside such as ATP, GTP, ADP, AMP,
adenosine, cAMP, cGMP, cCCP or their deoxy equivalents, and inositol phosphates such
as IP3 or IP4, a lipid such as diacyl glycerol, phosphatidyl inositol bisphosphate, phosphate,
a cation such as Ca2+, K+ or Na+, Cu2+ or Zn2+, or anion such as Cl-, sulphate, or gas such
as NO, O2 or H2, or a protein binding site such as calmodulin, kinesin, dynein, tubulin, or
myosin.

When pholasin is triggered by oxygen metabolites, the Pholas luciferase or peroxidase,
energy transfer occurs from pholasin oxyluciferin through GFP to fluorescein resulting in a
yellow emission. Addition of thrombin for 3 h cleaves the GFP-fluorescein from the
pholasin and the light emission returns to the blue of native pholasin. Addition of IP3 to
the full chimera alters the efficiency of energy transfer. As a result there is a change in the
ratio of light emitted in the yellow to blue. This ratio is directly related to, and can be plotted
against, the concentration or amount of analyte. The light is detected in a dual wavelength
luminometer or ratiometric imaging camera and the ratio of blue to green light measured.
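
The ratiometric read-out described above may be sketched as follows, as an illustration only. The two channel counts are converted to a yellow:blue ratio, which is then read against a set of calibration standards by piecewise-linear interpolation. The function names and the standard values are hypothetical; real standards would be measured with the chimera in use, and a monotonic relation between ratio and concentration is assumed.

```python
from bisect import bisect_left


def emission_ratio(yellow_counts: float, blue_counts: float) -> float:
    """Ratio of light emitted in the yellow channel to the blue channel."""
    if blue_counts <= 0:
        raise ValueError("blue channel counts must be positive")
    return yellow_counts / blue_counts


def analyte_from_ratio(ratio: float, standards: list) -> float:
    """Interpolate analyte concentration from (concentration, ratio) standards."""
    points = sorted(standards, key=lambda p: p[1])  # order by ratio
    ratios = [r for _, r in points]
    if not ratios[0] <= ratio <= ratios[-1]:
        raise ValueError("ratio lies outside the calibrated range")
    i = bisect_left(ratios, ratio)
    if ratios[i] == ratio:
        return points[i][0]
    (c0, r0), (c1, r1) = points[i - 1], points[i]
    return c0 + (c1 - c0) * (ratio - r0) / (r1 - r0)


# Hypothetical standards: (analyte concentration, yellow:blue ratio).
standards = [(0.0, 2.0), (1.0, 1.5), (2.0, 1.0)]
ratio = emission_ratio(yellow_counts=500.0, blue_counts=400.0)
print(ratio, analyte_from_ratio(ratio, standards))  # 1.25 1.5
```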

Alternatively any fluors can be used, and any binding sites with the right characteristics as 
shown in these examples will work provided a simple screen is used to select the right 
chimeras. 

EXAMPLE 8: Engineering a BOIP into a "Rainbow Protein" for two analytes 
together 

Apopholasin is linked to firefly luciferase by using cDNAs and PCR, followed by
expression in insect cells as described in Example 2. Addition of the luciferin as described
in Example 3 generates the pholasin. In the presence of firefly luciferin (1 mM), ATP and
oxygen metabolites, this chimera emits blue and yellow light simultaneously, which can be
independently measured by using a dual wavelength luminometer or imaging camera.

EXAMPLE 9: Expression of BOIPs in mammalian cells 

Apopholasin DNA, cDNA or genomic, in an expression plasmid with the CMV promoter, is
transfected into HeLa cells. After incubation for 3 days to allow expression of the
apopholasin, the luciferin is added to form pholasin. Expression is checked using a
polyclonal antibody to pholasin raised in rabbits. Addition of oxygen metabolites outside
the cell allows the permeability of the plasma membrane to oxygen metabolites to be
assessed. As the oxygen metabolites permeate into the cytosol, the light emission
increases.

EXAMPLE 10: Expression of BOIPs in plants



cDNA or genomic DNA coding for apopholasin is inserted into a plasmid with the cauliflower
mosaic virus promoter and transformed into Agrobacterium tumefaciens. These are then
added to a tobacco leaf, seedlings generated, and those expressing apopholasin selected.
The plants are grown to seed, and seedlings grown from this seed. Addition of luciferin
forms the pholasin as described in Example 3. Stressing the plant, e.g. with wind, touch,
cold, or peroxide, or during growth and development or by a hormone, generates light,
showing the formation of oxygen metabolites within the live plant. A cell-specific promoter
engineered on to the apopholasin cDNA before making the transgenic plant enables
oxygen metabolites to be detected in specific cells within the whole, living plant.

EXAMPLE 11: Detection of oxidative damage in vitro 

Addition of pholasin to serum or plasma from a rat, mouse or human enables oxygen
metabolites to be detected and measured on addition of a drug or other substance of
interest.

EXAMPLE 12: Detection of ROMs in heart cells


Reperfusion has been proposed to lead to oxygen metabolite damage in cardiac myocytes.
Pholasin allows this to be tested for the first time. Plasmid containing apopholasin cDNA
and the CMV promoter is transfected into isolated cardiac myocytes in culture.
Expression occurs within 1-3 days, and pholasin is formed by addition of the luciferin as
described in Example 3. Subjecting the cells to hypoxia followed by readmission of normal
oxygen leads to light emission, showing that oxygen metabolites have been generated
inside the cells. By using an imaging camera, the digital or analogue nature of this can be
assessed as the number of cells emitting light can be visualised and counted.
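
One way the counting step above could be carried out on per-cell photon counts is sketched below, for illustration only. A cell is scored as emitting when its count exceeds the mean background by a chosen number of standard deviations. The threshold rule, the factor k = 3 and the example counts are assumptions, not part of the method as described.

```python
from statistics import mean, stdev


def emitting_cells(cell_counts, background_counts, k=3.0):
    """Flag each cell whose photon count exceeds background mean + k * SD."""
    threshold = mean(background_counts) + k * stdev(background_counts)
    return [count > threshold for count in cell_counts]


def fraction_emitting(cell_counts, background_counts, k=3.0):
    """Fraction of cells scored as emitting light."""
    flags = emitting_cells(cell_counts, background_counts, k)
    return sum(flags) / len(flags)


# Hypothetical photon counts per region of interest.
background = [10, 12, 11, 9, 8]
cells = [9, 15, 200, 14, 180]
print(sum(emitting_cells(cells, background)), fraction_emitting(cells, background))
```

A response in which a growing fraction of cells switches on fully would suggest digital behaviour, whereas a graded rise in every cell would suggest analogue behaviour.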

EXAMPLE 13: Detection of ROMs in the nucleus and endoplasmic reticulum (ER)



Plasmid containing apopholasin cDNA with either nucleoplasmin DNA or calreticulin
DNA (with or without KDEL on the C-terminus) linked to the pholasin DNA, to target the
apopholasin to the nucleus or ER respectively, and the CMV promoter for expression, is
transfected into HeLa cells in culture. Expression occurs within 1-3 days, and pholasin is
formed by addition of the luciferin as described in Example 3. Addition of oxygen
metabolites outside the cells, or hypoxic/oxygen shock, generates light measured in a
luminometer, showing how fast oxygen metabolites penetrate into these organelles. By
imaging with a photon counting imaging camera, the number of cells permeable to oxygen
metabolites can be counted. Location of the pholasin can be assessed by imaging live cells,
or by using immunofluorescence with the pholasin antibody on partially-fixed cells or
GFP-pholasin in live cells. Using a rainbow protein, two or more analytes can be detected
together.






EXAMPLE 14: Use of pholasin as a protein label 



Pholasin can be used as a label in homogeneous or heterogeneous immunoassay.
Apopholasin is first covalently linked to an antibody to HIV, and pholasin formed by
addition of luciferin as described in Example 3. The antibody is then used in a standard
chemiluminometric immunoassay format. Addition of HIV antigen leads to an increase in
antibody binding and an increase in light emission dependent on the amount of HIV added.
The amount of HIV in a blood sample can be assessed by relating the pholasin light
emission in the sample to the standard curve.
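
Relating a sample reading to a standard curve may be sketched as below, as an illustration under stated assumptions. A straight line is fitted to the standards by least squares and inverted for the sample. The linear response, the function names and all numerical values are hypothetical; a real assay may need a non-linear curve.

```python
def fit_line(amounts, light):
    """Least-squares slope and intercept for light = slope * amount + intercept."""
    n = len(amounts)
    mean_x = sum(amounts) / n
    mean_y = sum(light) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, light))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_light(sample_light, slope, intercept):
    """Invert the fitted standard curve to estimate the amount of antigen."""
    return (sample_light - intercept) / slope


# Hypothetical standards: antigen amount (arbitrary units) and light counts.
standard_amounts = [0.0, 1.0, 2.0, 4.0]
standard_light = [100.0, 300.0, 500.0, 900.0]
slope, intercept = fit_line(standard_amounts, standard_light)
print(slope, intercept, amount_from_light(700.0, slope, intercept))
```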


EXAMPLE 15: Pholasin as a DNA label 



Apopholasin is covalently linked to an oligonucleotide probe for detecting the presence of
the cystic fibrosis gene. Addition of the probe to DNA in a standard Southern blot allows
the probe to bind when the gene is present. Addition of luciferin as described in Example
3 allows the pholasin to form. Addition of hypochlorite (10 mM) in barbitone buffer pH 9
causes the pholasin to flash and the gene can be visualised by the photon counting imaging
camera.

EXAMPLE 16: Pholasin in a two hybrid system

Protein-protein interaction can be detected by engineering apopholasin to one half of a two
hybrid system and GFP to the other. Binding will allow the yeast to grow.

EXAMPLE 17: Pholasin in genetic entertainment

Pholasin is able to chemiluminesce at a wide range of pH (3-10), including acid pH such as
3-4. Thus it can be added to drinks such as beer, cola, soft drinks, and spirits to make them
glow. It can also make food glow by adding it to the ingredients of cakes, icing or
popcorn; by painting the pholasin or apopholasin on to the food; or by making it
genetically in the source of the food. It can be used in a wide range of toys and other
entertaining devices including squirt guns, greeting cards and pens.


The rainbow proteins can also be used as an alternative to pholasin alone, resulting in a
rainbow of colours and colour changes.

EXAMPLE 18: Pholasin in transgenic animals


Transgenic organisms such as nematodes, mice or plants can be generated from apopholasin
cDNA by standard techniques. Injecting the luciferin, or incubating the whole plant in it, forms
the active pholasin. Oxygen or its metabolites can then be detected, measured and imaged,
in an intact organ, or from the whole organism. It can also be used in humans, in DNA
therapy or diagnosis.

EXAMPLE 19: Apoprotein from the luminous squid Ommastrophes 

Apoprotein from the luminous squid Ommastrophes is substituted for
apopholasin, and the methods of Examples 1 to 18, above, are carried out.

EXAMPLE 20: Apoprotein from the mollusc Rocellaria

The apoprotein from the mollusc Rocellaria is substituted for apopholasin, and
the methods of Examples 1 to 18, above, are carried out.



EXAMPLE 21: Earthworm luciferase

Earthworm luciferase as a BOIP is substituted for apopholasin, and the methods
of Examples 1 to 18, above, are carried out.

EXAMPLE 22



Genomic DNA from Pholas, Rocellaria, Ommastrophes, or earthworm is substituted for
the recombinant protein in Examples 1 to 18, above, the methods of which are carried out
in an analogous manner.

CLAIMS



1. An isolated, purified or recombinant nucleic acid sequence comprising:

(a) a sequence that encodes the apophotoprotein of pholasin (alternatively, 
'apopholasin'); 

(b) a sequence substantially homologous to or that hybridises to sequence (a) under 
stringent conditions; or 

(c) a sequence substantially homologous to or that hybridises under stringent 
conditions to the sequence (a) or (b) but for the degeneracy of the genetic code; or 

(d) an oligonucleotide specific for any of the sequences (a), (b) or (c). 

2. A sequence according to claim 1, wherein the sequence that encodes for 
apopholasin is as shown in Figure 4B. 

3. A sequence according to claim 1, wherein the sequence that encodes for
apopholasin is as shown in any one of Figures 1, 2, 3, 4A, 6 or 9.

4. A sequence according to any preceding claim, wherein the apopholasin is non- 
glycosylated. 

5. A sequence according to any preceding claim, wherein the apopholasin is 
glycosylated. 

6. An isolated, purified or recombinant construct incorporating a sequence encoding 
apopholasin protein according to any preceding claim. 

7. An isolated, purified or recombinant construct incorporating a sequence encoding 
an apophotoprotein whose expression in a substrate, in association with a luciferin 
therefor, signals the presence of oxygen or an oxygen metabolite in the substrate. 

8. An isolated, purified or recombinant construct incorporating a sequence encoding 
an apophotoprotein whose expression in a substrate, in association with a luciferin 
therefor, signals the presence of oxygen or an oxygen metabolite in the absence of 
a corresponding luciferase in the substrate. 

9. A recombinant construct according to any one of claims 1 to 8, wherein the nucleic 
acid sequence is linked operably with nucleotides enabling expression and secretion 
of the apopholasin in a cellular host. 

10. DNA or RNA according to any of claims 1 to 9. 

11. An isolated, purified or recombinant polypeptide comprising apophotoprotein of
pholasin (apopholasin) or a mutant or variant thereof having substantially the same
activity as apopholasin.

12. An isolated, purified or recombinant polypeptide according to claim 11 comprising
the amino acid sequence of Figure 4 or Figure 5.

13. The apopholasin according to claim 11 or claim 12 when expressed by recombinant
DNA or RNA according to claim 10.

14. The apopholasin according to claim 13, which is non-glycosylated.

15. A cell, plasmid, virus or live organism that has been genetically engineered to
produce an apoprotein, said cell, plasmid, virus or live organism having
incorporated expressibly therein a sequence according to any one of claims 1 to 10.

16. A vector comprising a sequence according to any one of claims 1 to 10.

17. A host cell transformed or transfected with a vector according to claim 16.

18. A BOIP, as defined herein, comprising an apophotoprotein according to any one of
claims 11 to 14 in association with a luciferin.

19. A BOIP according to claim 18, wherein the luciferin is derived from Pholas
dactylus.

20. A method for the preparation of a BOIP, as defined herein, which method
comprises bringing an apophotoprotein, such as recombinant apopholasin, into
association with a luciferin therefor, such as a luciferin derived from Pholas
dactylus.

21. A BOIP, apophotoprotein thereof, or a nucleic acid sequence encoding either of
these, which comprises a sequence according to any one of Figures 2 to 6 or 9 that
has been chemically or genetically modified.

22. A method for the detection and/or measurement of oxygen or one of its
metabolites extracellularly, which method comprises providing a BOIP, such as
native or chemically- or genetically-modified BOIP or a 'rainbow protein' based
on such a BOIP, extracellularly and thereafter detecting and/or quantifying light
emission therefrom and/or changes in colour, intensity and/or polarisation of
emission(s), wherein the apophotoprotein comprises recombinant apopholasin.

23. A method for the detection and/or measurement of oxygen or one of its
metabolites in live cells (intracellularly), which method comprises providing a
BOIP, such as native or chemically- or genetically-modified BOIP or a 'rainbow
protein' based on such a BOIP, intracellularly and thereafter detecting and/or
quantifying light emission therefrom and/or changes in colour, intensity and/or
polarisation of emission(s) therefrom.

24. A method according to claim 22 or 23, wherein said BOIP includes a signal
peptide, targeting it to a pre-determined extra- or intracellular site.

25. A method according to claim 22 or claim 23, comprising incubating a test sample
with a cell according to claim 15 or with a membrane preparation derived
therefrom.

26. A method according to any one of claims 22 to 24, wherein light emission takes 
place in the absence of a luciferase. 

27. The use of a sequence or a protein according to any one of claims 1 to 21 in the 
detection, diagnosis or measurement of oxygen or a metabolite thereof. 

28. A diagnostic kit incorporating a sequence or protein according to any one of claims 
1 to 21. 

29. A process for obtaining a substantially homologous source of apopholasin, which 
comprises culturing cells having incorporated expressibly therein a polynucleotide 
encoding apopholasin as defined in any one of claims 1 to 10, and thereafter 
recovering the cultured cells. 

30. A method, use or kit according to any one of claims 20 to 28, substantially as 
hereinbefore described with particular reference to the Examples. 
FIGURE 1
Clone 40: 

GAATTCGGCACGAGTCGGAAAAGAACAAAATGGCTTGTATCGTTTTCGTT
GCTCTTGTCGCTCTATGCTTAATGCAACCGGGTTCCGGTGAGGAAGTACA
ATGCGCGATGAATTGGACACAAGCTAATGAATATGTGTTCAACGTGGACT
GGATGACCATTTTCATCTACGACTATGGCGCTCAAGAGCAACTGTACGAA
GATCGGGCTTTGGGGCTGTGTCGGATTGAACGGGCCGGCCCAGGTACCAC
AAAAGCCGTCTGGATTAACTGGAGTAACGACACGCAGTCATGTGTAACAA
GAAAAACAATCTTCTTCGAGGTTGGTGGAGAAATTGCCCGGCTAGTTGAC
TACAGACCACAGGAAGACGGAACTGAGAAAACTTTTACAAGAAAATTCTC
TAGCAAAATGCCAGGCACTTACATGCTTATGGACGTGTGCGCTACAAGGG
ACGCTGATGATAAATGCATCGAAGGCACAATTGTGGTGACAGTCAGGGTG
TCCCTATATGACGAAGATAACAATGGTGTAATGGATGAAGGTAAGGTGAT
TCCATCTGAGACAATCGAGGATGATATCAAGGACTGTGGGCTCTTAGACC
AAGATGTTGAACTCGATTATACGTGGACTCAAAACGAGTGTGATCTACCA
GACACAGTAGACGAGGCTGAAGACACACCGTCAGAAACTGGAGAATTCTT
CTGGTAGATCTATCAGACTACTTTTATCAGCAGGACAACTGGTCGTTACC
AGACACCTATAACGTGTCCTCATCAATAATGTGTAAAACAGAAATAATCG
ATAGAATATTGAAAATAAAATGTTAATAAACACTGGTTGAAATATGAAAA
AAAAAAAAAAAAAACTCGAG

Clone 3:

GAATTCGGCACGAGGGAAAAGAACAAAATGGCTTGTATCGTTTTCGTT
GCTCTTGTCGCTCTATGCTTAATGCAACCGGGTTCCGGTGAGGAAGTACA
ATGCGCGATGAATTGGACACAAGCTAATGAATATGTGTTCAACGTGGACT
GGATGACCATTTTCATCTACGACTATGGCGCTCAAGAGCAACTGTACGAG
GATCGGGCTTTGGGGCTGTGTCGGATTGAACGGGCCGGCCCAGGTACCAC
AAAAGCCGTCTGGATTAACTGGAGTAACGACACGCAGTCATGTGTAACAA
GAAAAACAATCTTCTTCGAGGTTGGTGGAGAAATTGCCCGGCTAGTTGAC
TACAGACCACAGGAAGACGGAACTGAGAAAACTTTTACAAGAAAATTCTC
TAGCAAAATGCCAGGCACTTACATGCTTATGGACGTGTGCGCTACAAGGG
ACGCTGATGATAAATGCATCGAAGGCACAATTGTGGTGACAGTCAGGGTG
TCCCTATATGACGAAGATAACAATGGTGTAATGGATGAAGGTAAGGTTAT
TCCATCTGAGACAATCGAGGATGATATCAAGGACTGTGGGCTCTTAGACC
AAGATGTTGAACTCGATTATACGTGGACTCAAAACGAGTGTGATCTACCA
GACACAGTAGACGAGGCTGAAGACACACCGTCAGAAACTGGAGAATTCTT
CTGGTAGATCTATCAGACCACTTTTATCAGCAGGACAACTGGTCGTTACC
AGACACCTATAACGTGTCCTCATCAATAATGTGTAAAACAGAAATAATCG
ATAGAATATTGAAAATAA

Clone 5:

GTCGGAAAAGAACAAAATGGCTTGTATCGTTTTCGTTGCTCTTGTCGCTCTATGCTTAATGCAACCGGGTTC
CGGTGAGGAAGTACAATGCGCGATGAATTGGACACAAGCTAATGAATATGTGTTCAACGTGGACTGGATGAC
CATTTTCATCTACGACTATGGCGCTCAAGAGCAACTGTACGAGGATCGGGCTTTGGGGCTGTGTCGGATTGA
ACGGGCCGGCCCAGGTACCACAAAAGCCGTCTGGATTAACTGGAGTAACGACACGCAGTCATGTGTAACAAG
AAAAACAATCTTCTTCGAGGTTGGTGGAGAAATTGCCCGGCTAGTTGACTACAGACCACAGGAAGACGGAAC
TGAGAAAACTTTTACAAGAAAATTCTCTAGCAAAATGCCAGGCACTTACATGCTTATGGACGTGTGCGCTAC
AAGGGACGCTGATGATAAATGCATCGAAGGCACAATTGTGGTGACAGTCAGGGTGTCCCTATATGACGAAGA
TAACAATGGTGTAATGGATGAAGGTAAGGTTATTCCATCTGAGACAATCGAGGATGATATCAAGGACTGTGG
GCTCTTAGACCAAGATGTTGAACTCGATTATACGTGGACTCAAAACGAGTGTGATCTACCAGACACAGTAGA
CGAGGCTGAAGACACACCGTCAGAAACTGGAGAATTCTTCTGGTAGATCTATCAGACCACTTTTATCAGCAG
GACAACTGGTCGTTACCAGACACCTATAACGTGTCCTCATCAATAATGTGTAAAACAGAAATAATCGATAGA
ATATTGAAAATAAAATGTTAATAGACACTGGTTGAAAAAAAAAAAAAAAAAAAACTCGAG
clone 40  GAATTCGGCACGAGTCGGAAAAGAACAAAATGGCTTGTATCGTTTTCGTT
clone 3   GAATTCGGCACGAG--GGAAAAGAACAAAATGGCTTGTATCGTTTTCGTT
clone 5                GTCGGAAAAGAACAAAATGGCTTGTATCGTTTTCGTT

clone 40  GCTCTTGTCGCTCTATGCTTAATGCAACCGGGTTCCGGTGAGGAAGTACA
clone 3   GCTCTTGTCGCTCTATGCTTAATGCAACCGGGTTCCGGTGAGGAAGTACA
clone 5   GCTCTTGTCGCTCTATGCTTAATGCAACCGGGTTCCGGTGAGGAAGTACA
          **************************************************

clone 40  ATGCGCGATGAATTGGACACAAGCTAATGAATATGTGTTCAACGTGGACT
clone 3   ATGCGCGATGAATTGGACACAAGCTAATGAATATGTGTTCAACGTGGACT
clone 5   ATGCGCGATGAATTGGACACAAGCTAATGAATATGTGTTCAACGTGGACT

clone 40  GGATGACCATTTTCATCTACGACTATGGCGCTCAAGAGCAACTGTACGAA
clone 3   GGATGACCATTTTCATCTACGACTATGGCGCTCAAGAGCAACTGTACGAG
clone 5   GGATGACCATTTTCATCTACGACTATGGCGCTCAAGAGCAACTGTACGAG
          *************************************************

clone 40  GATCGGGCTTTGGGGCTGTGTCGGATTGAACGGGCCGGCCCAGGTACCAC
clone 3   GATCGGGCTTTGGGGCTGTGTCGGATTGAACGGGCCGGCCCAGGTACCAC
clone 5   GATCGGGCTTTGGGGCTGTGTCGGATTGAACGGGCCGGCCCAGGTACCAC
          **************************************************

clone 40  AAAAGCCGTCTGGATTAACTGGAGTAACGACACGCAGTCATGTGTAACAA
clone 3   AAAAGCCGTCTGGATTAACTGGAGTAACGACACGCAGTCATGTGTAACAA
clone 5   AAAAGCCGTCTGGATTAACTGGAGTAACGACACGCAGTCATGTGTAACAA
          **************************************************

clone 40  GAAAAACAATCTTCTTCGAGGTTGGTGGAGAAATTGCCCGGCTAGTTGAC
clone 3   GAAAAACAATCTTCTTCGAGGTTGGTGGAGAAATTGCCCGGCTAGTTGAC
clone 5   GAAAAACAATCTTCTTCGAGGTTGGTGGAGAAATTGCCCGGCTAGTTGAC
          **************************************************

clone 40  TACAGACCACAGGAAGACGGAACTGAGAAAACTTTTACAAGAAAATTCTC
clone 3   TACAGACCACAGGAAGACGGAACTGAGAAAACTTTTACAAGAAAATTCTC
clone 5   TACAGACCACAGGAAGACGGAACTGAGAAAACTTTTACAAGAAAATTCTC
          **************************************************

clone 40  TAGCAAAATGCCAGGCACTTACATGCTTATGGACGTGTGCGCTACAAGGG
clone 3   TAGCAAAATGCCAGGCACTTACATGCTTATGGACGTGTGCGCTACAAGGG
clone 5   TAGCAAAATGCCAGGCACTTACATGCTTATGGACGTGTGCGCTACAAGGG
          **************************************************

clone 40  ACGCTGATGATAAATGCATCGAAGGCACAATTGTGGTGACAGTCAGGGTG
clone 3   ACGCTGATGATAAATGCATCGAAGGCACAATTGTGGTGACAGTCAGGGTG
clone 5   ACGCTGATGATAAATGCATCGAAGGCACAATTGTGGTGACAGTCAGGGTG

clone 40  TCCCTATATGACGAAGATAACAATGGTGTAATGGATGAAGGTAAGGTGAT
clone 3   TCCCTATATGACGAAGATAACAATGGTGTAATGGATGAAGGTAAGGTTAT
clone 5   TCCCTATATGACGAAGATAACAATGGTGTAATGGATGAAGGTAAGGTTAT
          *********************************************** **

clone 40  TCCATCTGAGACAATCGAGGATGATATCAAGGACTGTGGGCTCTTAGACC
clone 3   TCCATCTGAGACAATCGAGGATGATATCAAGGACTGTGGGCTCTTAGACC
clone 5   TCCATCTGAGACAATCGAGGATGATATCAAGGACTGTGGGCTCTTAGACC
          **************************************************

clone 40  AAGATGTTGAACTCGATTATACGTGGACTCAAAACGAGTGTGATCTACCA
clone 3   AAGATGTTGAACTCGATTATACGTGGACTCAAAACGAGTGTGATCTACCA
clone 5   AAGATGTTGAACTCGATTATACGTGGACTCAAAACGAGTGTGATCTACCA

clone 40  GACACAGTAGACGAGGCTGAAGACACACCGTCAGAAACTGGAGAATTCTT
clone 3   GACACAGTAGACGAGGCTGAAGACACACCGTCAGAAACTGGAGAATTCTT
clone 5   GACACAGTAGACGAGGCTGAAGACACACCGTCAGAAACTGGAGAATTCTT

clone 40  CTGGTAGATCTATCAGACTACTTTTATCAGCAGGACAACTGGTCGTTACC
clone 3   CTGGTAGATCTATCAGACCACTTTTATCAGCAGGACAACTGGTCGTTACC
clone 5   CTGGTAGATCTATCAGACCACTTTTATCAGCAGGACAACTGGTCGTTACC

clone 40  AGACACCTATAACGTGTCCTCATCAATAATGTGTAAAACAGAAATAATCG
clone 3   AGACACCTATAACGTGTCCTCATCAATAATGTGTAAAACAGAAATAATCG
clone 5   AGACACCTATAACGTGTCCTCATCAATAATGTGTAAAACAGAAATAATCG
          **************************************************

clone 40  ATAGAATATTGAAAATAAAATGTTAATAAACACTGGTTGAAATATGAAAA
clone 3   ATAGAATATTGAAAATAA
clone 5   ATAGAATATTGAAAATAAAATGTTAATAGACACTGGTTGAAA-----AAA

clone 40  AAAAAAAAAAAAAACTCGAG
clone 3
clone 5   AAAAAAAAAAAAAACTCGAG
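
The asterisk rows in the alignment above mark columns in which all three clones carry the same base. The following is a minimal Python sketch of that rule, not the program used to prepare the figure. The function name `consensus_line` and the use of `-` as the gap character are assumptions made for illustration; the three rows are the block of the alignment in which clone 40 ends in GAA and clones 3 and 5 end in GAG.

```python
def consensus_line(rows, gap="-"):
    """Return '*' for columns where every row has the same base, ' ' otherwise."""
    width = max(len(row) for row in rows)
    # Pad short rows with the gap character so that all columns line up.
    padded = [row.ljust(width, gap) for row in rows]
    marks = []
    for column in zip(*padded):
        identical = len(set(column)) == 1 and column[0] != gap
        marks.append("*" if identical else " ")
    return "".join(marks)


clone_40 = "GGATGACCATTTTCATCTACGACTATGGCGCTCAAGAGCAACTGTACGAA"
clone_3 = "GGATGACCATTTTCATCTACGACTATGGCGCTCAAGAGCAACTGTACGAG"
clone_5 = "GGATGACCATTTTCATCTACGACTATGGCGCTCAAGAGCAACTGTACGAG"

marks = consensus_line([clone_40, clone_3, clone_5])
# 1-based column numbers, within this block, at which the clones differ.
mismatches = [i + 1 for i, mark in enumerate(marks) if mark == " "]
print(marks)
print(mismatches)
```

For this block the only unmarked column is the last one, which corresponds to the GAA/GAG difference between clone 40 and the other two clones.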
FIGURE 3

GAATTCGGCACGAGTCGGAAAAGAACAAAATGGCTTGTATCGTTTTCGTTGCTCTTG

8S

TCGCTCTATGCTTAATGCAACCGGGTTCCGGTGAGGAAGTACAATGCGCGATGAATT
GGACACAAGCTAATGAATATGTGTTCAACGTGGACTGGATGACCATTTTCATCTACG

ACTATGGCGCTCAAGAGCAACTGTACGAAGATCGGGCTTTGGGGCTGTGTCGGATTG

3A

AACGGGCCGGCCCAGGTACCACAAAAGCCGTCTGGATTAACTGGAGTAACGACACGC

AGTCATGTGTAACAAGAAAAACAATCTTCTTCGAGGTTGGTGGAGAAATTGCCCGGC

4S

TAGTTGACTACAGACCACAGGAAGACGGAACTGAGAAAACTTTTACAAGAAAATTCT
CTAGCAAAATGCCAGGCACTTACATGCTTATGGACGTGTGCGCTACAAGGGACGCTG

ATGATAAATGCATCGAAGGCACAATTGTGGTGACAGTCAGGGTGTCCCTATATGACG

6A

AAGATAACAATGGTGTAATGGATGAAGGTAAGGTGTTCCATCTGAGACAATCGAGGA

TGATATCAAGGACTGTGGGCTCTTAGACCAAGATGTTGAACTCGATTATACGTGGAC

7S

TCAAAACGAGTGTGATCTACCAGACACAGTAGACGAGGCTGAAGACACACCGTCAGA
AACTGGAGAATTCTTCTGGTAGATCTATCAGACTACTTTTATCAGCAGGACAACTGG
TCGTTACCAGACACCTATAACGTGTCCTCATCAATAATGTGTAAAACAGAAATAATC

GATAGAATATTGAAAATAAAATGTTAATAAACACTGGTTGAAATATGAAAAAAAAAA

5A

AAAAAAAACTCGAG
Untranslated region

GAATTCGGCACGAGTCGGAAAAGAACAAA

Translated region

ATG GCT TGT ATC GTT TTC GTT GCT CTT GTC GCT CTA TGC TTA ATG   45
M   A   C   I   V   F   V   A   L   V   A   L   C   L   M

CAA CCG GGT TCC GGT GAG GAA GTA CAA TGC GCG ATG AAT TGG ACA   90
Q   P   G   S   G   E   E   V   Q   C   A   M   N   W   T

CAA GCT AAT GAA TAT GTG TTC AAC GTG GAC TGG ATG ACC ATT TTC   135
Q   A   N   E   Y   V   F   N   V   D   W   M   T   I   F

ATC TAC GAC TAT GGC GCT CAA GAG CAA CTG TAC GAA GAT CGG GCT   180
I   Y   D   Y   G   A   Q   E   Q   L   Y   E   D   R   A

TTG GGG CTG TGT CGG ATT GAA CGG GCC GGC CCA GGT ACC ACA AAA   225
L   G   L   C   R   I   E   R   A   G   P   G   T   T   K

GCC GTC TGG ATT AAC TGG AGT AAC GAC ACG CAG TCA TGT GTA ACA   270
A   V   W   I   N   W   S   N   D   T   Q   S   C   V   T

AGA AAA ACA ATC TTC TTC GAG GTT GGT GGA GAA ATT GCC CGG CTA   315
R   K   T   I   F   F   E   V   G   G   E   I   A   R   L

GTT GAC TAC AGA CCA CAG GAA GAC GGA ACT GAG AAA ACT TTT ACA   360
V   D   Y   R   P   Q   E   D   G   T   E   K   T   F   T

AGA AAA TTC TCT AGC AAA ATG CCA GGC ACT TAC ATG CTT ATG GAC   405
R   K   F   S   S   K   M   P   G   T   Y   M   L   M   D

GTG TGC GCT ACA AGG GAC GCT GAT GAT AAA TGC ATC GAA GGC ACA   450
V   C   A   T   R   D   A   D   D   K   C   I   E   G   T

ATT GTG GTG ACA GTC AGG GTG TCC CTA TAT GAC GAA GAT AAC AAT   495
I   V   V   T   V   R   V   S   L   Y   D   E   D   N   N

GGT GTA ATG GAT GAA GGT AAG GTG ATT CCA TCT GAG ACA ATC GAG   540
G   V   M   D   E   G   K   V   I   P   S   E   T   I   E

GAT GAT ATC AAG GAC TGT GGG CTC TTA GAC CAA GAT GTT GAA CTC   585
D   D   I   K   D   C   G   L   L   D   Q   D   V   E   L

GAT TAT ACG TGG ACT CAA AAC GAG TGT GAT CTA CCA GAC ACA GTA   630
D   Y   T   W   T   Q   N   E   C   D   L   P   D   T   V

GAC GAG GCT GAA GAC ACA CCG TCA GAA ACT GGA GAA TTC TTC TGG   675
D   E   A   E   D   T   P   S   E   T   G   E   F   F   W

TAG ATC TAT CAG ACT ACT TTT ATC AGC AGG ACA ACT GGT CGT TAC   720
*

CAG ACA CCT ATA ACG TGT CCT CAT CAA TAA   750

* = stop for translation
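
The codon-to-residue correspondence shown above can be reproduced with the standard genetic code. The sketch below is illustrative only and is not taken from the specification. It assumes the standard nuclear code applies to this reading frame; the names `CODON_TABLE` and `translate` are hypothetical. The two test rows are the first and last coding rows of the figure.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # '*' marks a stop, as in the figure
            break
        protein.append(residue)
    return "".join(protein)


first_row = "ATG GCT TGT ATC GTT TTC GTT GCT CTT GTC GCT CTA TGC TTA ATG"
last_row = "GAC GAG GCT GAA GAC ACA CCG TCA GAA ACT GGA GAA TTC TTC TGG TAG"

print(translate(first_row))  # MACIVFVALVALCLM
print(translate(last_row))   # DEAEDTPSETGEFFW
```

Translation of the last row stops at TAG, which is the codon marked with an asterisk in the figure.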
GAT GAA GGT AAG GTG ATT CCA TCT GAG ACA ATC GAG
GAT GAT ATC AAG GAC TGT GGG CTC TTA GAC CAA GAT GTT GAA CTC
GAT TAT ACG TGG ACT CAA AAC GAG TGT GAT CTA CCA GAC ACA GTA
GAC GAG GCT GAA GAC ACA CCG TCA GAA ACT GGA GAA TTC TTC TGG
TAG

ATCTATCAGACTACTTTTATCAGCAGGACAACTGGTCGTTACCAGAC
ACCTATAACGTGTCCTCATCAATAATGTGTAAAACAGAAATAATCGA
TAGAATATTGAAAATAAAATGTTAATAAACACTGGTTGAAATATGAA
AAAAAAAAAAAAAAAACTCGAG

Xho I
EcoR I

GAATTC GGCACGAGTCGGAAAAGAACAAA

ATG GCT TGT ATC GTT TTC GTT GCT CTT GTC GCT CTA TGC TTA ATG
CAA CCG GGT TCC GGT GAG GAA GTA CAA TGC GCG ATG AAT TGG ACA
CAA GCT AAT GAA TAT GTG TTC AAC GTG GAC TGG ATG ACC ATT TTC
ATC TAC GAC TAT GGC GCT CAA GAG CAA CTG TAC GAA GAT CGG GCT
TTG GGG CTG TGT CGG ATT GAA CGG GCC GGC CCA GGT ACC ACA AAA
GCC GTC TGG ATT AAC TGG AGT AAC GAC ACG CAG TCA TGT GTA ACA
AGA AAA ACA ATC TTC TTC GAG GTT GGT GGA GAA ATT GCC CGG CTA
GTT GAC TAC AGA CCA CAG GAA GAC GGA ACT GAG AAA ACT TTT ACA
AGA AAA TTC TCT AGC AAA ATG CCA GGC ACT TAC ATG CTT ATG GAC
GTG TGC GCT ACA AGG GAC GCT GAT GAT AAA TGC ATC GAA GGC ACA
ATT GTG GTG ACA GTC AGG GTG TCC CTA TAT GAC GAA GAT AAC AAT
GGT GTA ATG GAT GAA GGT AAG GTG ATT CCA TCT GAG ACA ATC GAG
GAT GAT ATC AAG GAC TGT GGG CTC TTA GAC CAA GAT GTT GAA CTC
GAT TAT ACG TGG ACT CAA AAC GAG TGT GAT CTA CCA GAC ACA GTA
GAC GAG GCT GAA GAC ACA CCG TCA GAA ACT GGA GAA TTC TTC TGG
TAG

ATCTATCAGACTACTTTTATCAGCAGGACAACTGGTCGTTACCAGAC
ACCTATAACGTGTCCTCATCAATAATGTGTAAAACAGAAATAATCGA
TAGAATATTGAAAATAAAATGTTAATAAACACTGGTTGAAATATGAA
AAAAAAAAAAAAAAAACTCGAG

Xho I



FIGURE 5A

EEVQCAMNWTQANEYVFNVDWMTIFIYDYGAQEQLYEDRALGLCRIERAGPGTTKAVWIN
WSNDTQSCVTRKTIFFEVGGEIARLVDYRPQEDGTEKTFTRKFSSKMPGTYMLMDVCATR
DADDKCIEGTIVVTVRVSLYDEDNNGVMDEGKVIPSETIEDDIKDCGLLDQDVELDYTWT
QNECDLPDTVDEAEDTPSETGEFFW

FIGURE 5B

MACIVFVALVALCLMQPGSGEEVQCAMNWTQANEYVFNVDWMTIFIYDYGAQEQLYEDRA
LGLCRIERAGPGTTKAVWINWSNDTQSCVTRKTIFFEVGGEIARLVDYRPQEDGTEKTFT
RKFSSKMPGTYMLMDVCATRDADDKCIEGTIVVTVRVSLYDEDNNGVMDEGKVIPSETIE
DDIKDCGLLDQDVELDYTWTQNECDLPDTVDEAEDTPSETGEFFW
FIGURE 6

clone 40  GAATTCGGCACGAGTCGGAAAAGAACAAAATGGCTTGTATCGTTTTCGTT
BioXAct                                 TGGCTTGTATCGTTTTCGTT
rTth

clone 40  GCTCTTGTCGCTCTATGCTTAATGCAACCGGGTTCCGGTGAGGAAGTACA
BioXAct   GCTCTTGTCGCTCTATGCTTAATGCAACCGGGTTCCGGTGAGGAAGTACA
rTth                   TATGCTTAATGCAACCGGGTTCCGGTGAGGAAGTACA
                       *************************************

clone 40  ATGCGCGATGAATTGGACACAAGCTAATGAATATGTGTTCAACGTGGACT
BioXAct   ATGCGCGATGAATTGGACACAAGCTAATGAATATGTGTTCAACGTGGACT
rTth      ATGCGCGATGAATTGGACACAAGCTAATGAATATGTGTTCAACGTGGACT
          **************************************************

clone 40  GGATGACCATTTTCATCTACGACTATGGCGCTCAAGAGCAACTGTACGAA
BioXAct   GGATGACCATTTTCATCTACGACTATGGCGCTCAAGAGCAACTGTACGAA
rTth      GGATGACCATTTTCATCTACGACTATGGCGCTCAAGAGCAACTGTACGAA

clone 40  GATCGGGCTTTGGGGCTGTGTCGGATTGAACGGGCCGGCCCAGGTACCAC
BioXAct   GATCGGGCTTTGGGGCTGTGTCGGATTGAACGGGCCGGCCCAGGTACCAC
rTth      GATCGGGCTTTGGGGCTGTGTCGGATTGAACGGGCCGGCCCAGGTACCAC
          **************************************************

clone 40  AAAAGCCGTCTGGATTAACTGGAGTAACGACACGCAGTCATGTGTAACAA
BioXAct   AAAAGCCGTCTGGATTAACTGGAGTAACGACACGCAGTCATGTGTAACAA
rTth      AAAAGCCGTCTGGATTAACTGGAGTAACGACACGCAGTCATGTGTAACAA
          **************************************************

clone 40  GAAAAACAATCTTCTTCGAGGTTGGTGGAGAAATTGCCCGGCTAGTTGAC
BioXAct   GAAAAACAATCTTCTTCGAGGTTGGTGGAGAAATTGCCCGGCTAGTTGAC
rTth      GAAAAACAATCTTCTTCGAGGTTGGTGGAGAAATTGCCCGGCTAGTTGAC
          **************************************************

clone 40  TACAGACCACAGGAAGACGGAACTGAGAAAACTTTTACAAGAAAATTCTC
BioXAct   TACAGACCACAGGAAGACGGAACTGAGAAAACTTTTACAAGAAAATTCTC
rTth      TACAGACCACAGGAAGACGGAACTGAGAAAACTTTTACAAGAAAATTCTC
          **************************************************

clone 40  TAGCAAAATGCCAGGCACTTACATGCTTATGGACGTGTGCGCTACAAGGG
BioXAct   TAGCAAAATGCCAGGCACTTACATGCTTATGGACGTGTGCGCTACAAGGG
rTth      TAGCAAAATGCCAGGCACTTACATGCTTATGGACGTGTGCGCTACAAGGG
          **************************************************

clone 40  ACGCTGATGATAAATGCATCGAAGGCACAATTGTGGTGACAGTCAGGGTG
BioXAct   ACGCTGATGATAAATGCATCGAAGGCACAATTGTGGTGACAGTCAGGGTG
rTth      ACGCTGATGATAAATGCATCGAAGGCACAATTGTGGTGACAGTCAGGGTG
          **************************************************

clone 40  TCCCTATATGACGAAGATAACAATGGTGTAATGGATGAAGGTAAGGTGAT
BioXAct   TCCCTATATGACGAAGATAACAATGGTGTAATGGATGAAGGTAAGGTGAT
rTth      TCCCTATATGACGAAGATAACAATGGTGTAATGGATGAAGGTAAGGTGAT
          **************************************************

clone 40  TCCATCTGAGACAATCGAGGATGATATCAAGGACTGTGGGCTCTTAGACC
BioXAct   TCCATCTGAGACAATCGAGGATGATATCAAGGACTGTGGGCTCTTAGACC
rTth      TCCATCTGAGACAATCGAGGATGATATCAAGGACTGTGGGCTCTTAGACC
          **************************************************

clone 40  AAGATGTTGAACTCGATTATACGTGGACTCAAAACGAGTGTGATCTACCA
BioXAct   AAGATGTTGAACTCGATTATACGTGGACTCAAAACGAGTGTGATCTACCA
rTth      AAGATGTTGAACTCGATTATACGTGGACTCAAAACGAGTGTGATCTACCA

clone 40  GACACAGTAGACGAGGCTGAAGACACACCGTCAGAAACTGGAGAATTCTT
BioXAct   GACACAGTAGACGAGGCTGAAGACACACCGTCAGAAACTGGAGAATTCTT
rTth      GACACAGTAGACGAGGCTGAAGACACACCGTCAGAAACTGGAGAATTCTT

clone 40  CTGGTAGATCTATCAGACTACTTTTATCAGCAGGACAACTGGTCGTTACC
BioXAct   CTGGTAGATCTATCAGACTACTTTTATCAGCAGGACAACTGGTCGTTACC
rTth      CTGGTANATCTATCAGACTACTTTTATCAGCAGGACAACTGGTCGTTACC
          ****** *******************************************

clone 40  AGACACCTATAACGTGTCCTCATCAATAATGTGTAAAACAGAAATAATCG
BioXAct   AGACACCTATAACGTGTCCTCATCAATAATGTGTAAAACAGAAATAATCG
rTth      AGACACCTATAACGTGTCCTCATCAATAATGTGTAAAAC
          ***************************************

clone 40  ATAGAATATTGAAAATAAAATGTTAATAAACACTGGTTGAAATATGAAAA
BioXAct   ATAGAATATTGAAAATAAAATGTTAATAAACACTGGTTGAAATATGAA
rTth

clone 40  AAAAAAAAAAAAAACTCGAG
BioXAct
rTth
FIGURE 7A

Oligo 1

ACI ATH TTY TTY CAR GT

Oligo 2

CAR GAR GAR GGN ACI GA

Oligo 2A

TCI GTN CCY TCY TCY TG

Oligo N

TTY AAY GTI GAY TGG ATG

M=A/C R=A/G W=A/T S=G/C Y=C/T K=G/T

V=A/C/G H=A/C/T D=A/G/T B=C/G/T N=A/C/G/T I=inosine
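
The degeneracy key above can be applied mechanically to the degenerate oligonucleotides. The following Python sketch is an illustration rather than part of the disclosure. It counts how many distinct sequences each oligo represents and confirms that Oligo 2A is the reverse complement of Oligo 2. Two assumptions are made: inosine (I) is treated as a single, non-degenerate position, and it is left as I when complemented. The helper names are hypothetical.

```python
from math import prod

# Bases represented by each code letter, following the key in Figure 7A.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "GC", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
    "I": "I",  # inosine: counted as one position (assumption)
}

# Complement of each code letter; inosine is kept as I (assumption).
COMPLEMENT = str.maketrans("ACGTMRWSYKVHDBNI", "TGCAKYWSRMBDHVNI")


def normalise(oligo):
    """Remove the triplet spacing used in the figure."""
    return oligo.replace(" ", "").upper()


def reverse_complement(oligo):
    """Reverse complement of a (possibly degenerate) oligonucleotide."""
    return normalise(oligo).translate(COMPLEMENT)[::-1]


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligo."""
    return prod(len(IUPAC[base]) for base in normalise(oligo))


oligo_1 = "ACI ATH TTY TTY CAR GT"
oligo_2 = "CAR GAR GAR GGN ACI GA"
oligo_2a = "TCI GTN CCY TCY TCY TG"

print(degeneracy(oligo_1))  # 3 * 2 * 2 * 2 = 24
print(degeneracy(oligo_2))  # 2 * 2 * 2 * 4 = 32
print(reverse_complement(oligo_2) == normalise(oligo_2a))  # True
```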

FIGURE 7B

Oligo 3A

ACA CAG CCC CAA AGC CCG AT

Oligo 4S

TTG CCC GGC TAG TTG ACT AC

Oligo 5A

CAT ATT TCA ACC AGT GTT TAT TAA

Oligo 6A

CAA TTG TGC CTT CGA TGC A

Oligo 7S

GGA CTG TGG GCT CTT AG

Oligo 8S

ATG GCT TGT ATC GTT TTC GT

Oligo T7

FIGURE 7C

Oligo ExS

CCA CAC GGA TCC TGA GGA AGT ACA ATG

Oligo ExA

CCA CAC GGA TCC TTA TTG ATG AGG ACA

Oligo Bacl

CTT GTT TTT ATG GTC GTC TAC ATT TCT TAC ATC TAT GCG GAG GAA
GTA CAA TG

Oligo C9 12

CCA CAC AGA TCT AGA ATG AAA TTC TTA GTC AAC GTT GCC CTT GTT
TTT ATG GTC

Oligo BV5

TTT ACT GTT TTC GTA ACA GTT TTG

Oligo BV3

CAA CAA CGC ACA GAA TCT AG
FIGURE 8

Enzyme     Position(s)

AccI       630
AflIII     405  734
AluI       95
AlwNI      659
Asp718     215
AsuI       204
BanI       215
BanII      564
BcnI       51   310
BglII      678
Bsp1286    564
BstNI      213  384
BstUI      77
Cfr10I     206
Cfr13I     204  209
DdeI       345  528  565
DpnI       174  615  680
EcoRI      665
EcoRII     211  382
EcoRV      547
FokI       136  518  554
HaeII      153
HaeIII     206  210
HgaI       431
HhaI       77   152  413
HincII     319
HinfI      520  598
HinP1I     75   150  411
HpaII      50   57   207  310
HphI       71   469  529
KpnI       219
MaeI       314  372
MaeII      114  405  593  734
MaeIII     245  265  457  716
MboII      182  274  277  347  497  653  661
MnlI       54   282  531  627  750
MseI       41   237
NaeI       208
NciI       51   310
NlaIII     264  397
NlaIV      55   217
NsiI       440
NspHI      397
PleI       592
RsaI       69   167  217
Sau3AI     172  613  678
Sau96I     204
ScrFI      51   213  310  384
SfaNI      428
TaqI       288  441  537  585
XhoII      678
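
The positions tabulated in Figure 8 can be checked against the sequence of Figure 9 by searching for each recognition site. The sketch below is illustrative only and was not used to prepare the figures. It rests on two assumptions: numbering starts at the A of the ATG start codon, and each tabulated position is the first base 3' of the cut on the strand shown. Only four enzymes with unambiguous palindromic sites are included, and the names `SEQ`, `ENZYMES` and `cut_positions` are hypothetical.

```python
# Strand shown in Figure 9, bases 1 to 750 (numbering origin is an assumption).
SEQ = (
    "ATGGCTTGTATCGTTTTCGTTGCTCTTGTCGCTCTATGCTTAATGCAACCGGGTTCCGGT"
    "GAGGAAGTACAATGCGCGATGAATTGGACACAAGCTAATGAATATGTGTTCAACGTGGAC"
    "TGGATGACCATTTTCATCTACGACTATGGCGCTCAAGAGCAACTGTACGAAGATCGGGCT"
    "TTGGGGCTGTGTCGGATTGAACGGGCCGGCCCAGGTACCACAAAAGCCGTCTGGATTAAC"
    "TGGAGTAACGACACGCAGTCATGTGTAACAAGAAAAACAATCTTCTTCGAGGTTGGTGGA"
    "GAAATTGCCCGGCTAGTTGACTACAGACCACAGGAAGACGGAACTGAGAAAACTTTTACA"
    "AGAAAATTCTCTAGCAAAATGCCAGGCACTTACATGCTTATGGACGTGTGCGCTACAAGG"
    "GACGCTGATGATAAATGCATCGAAGGCACAATTGTGGTGACAGTCAGGGTGTCCCTATAT"
    "GACGAAGATAACAATGGTGTAATGGATGAAGGTAAGGTGATTCCATCTGAGACAATCGAG"
    "GATGATATCAAGGACTGTGGGCTCTTAGACCAAGATGTTGAACTCGATTATACGTGGACT"
    "CAAAACGAGTGTGATCTACCAGACACAGTAGACGAGGCTGAAGACACACCGTCAGAAACT"
    "GGAGAATTCTTCTGGTAGATCTATCAGACTACTTTTATCAGCAGGACAACTGGTCGTTAC"
    "CAGACACCTATAACGTGTCCTCATCAATAA"
)

# Recognition site and number of bases 5' of the cut within that site,
# e.g. G^AATTC gives 1 for EcoRI and GGTAC^C gives 5 for KpnI.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "KpnI": ("GGTACC", 5),
    "BglII": ("AGATCT", 1),
    "EcoRV": ("GATATC", 3),
}


def cut_positions(seq, site, offset):
    """Return the 1-based position of the first base 3' of each cut."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset + 1)
        start = seq.find(site, start + 1)
    return positions


sites = {
    name: cut_positions(SEQ, site, offset)
    for name, (site, offset) in ENZYMES.items()
}
for name, positions in sites.items():
    print(name, positions)
```

Under these assumptions the four enzymes are found at 665 (EcoRI), 219 (KpnI), 678 (BglII) and 547 (EcoRV), in agreement with the table. Enzymes with degenerate or non-palindromic sites would need pattern matching on both strands, which this sketch does not attempt.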
FIGURE 9

NlaIV
MnlI
ScrFI
NciI
BcnI
MseI HpaII HpaII
ATGGCTTGTATCGTTTTCGTTGCTCTTGTCGCTCTATGCTTAATGCAACCGGGTTCCGGT 60

HhaI
HphI BstUI
RsaI HinP1I AluI MaeII
GAGGAAGTACAATGCGCGATGAATTGGACACAAGCTAATGAATATGTGTTCAACGTGGAC 120

HaeII
HhaI DpnI
FokI HinP1I RsaI Sau3AI
TGGATGACCATTTTCATCTACGACTATGGCGCTCAAGAGCAACTGTACGAAGATCGGGCT 180

HaeIII
Sau96I
Cfr13I
NaeI RsaI
HpaII NlaIV
HaeIII BanI
Cfr10I ScrFI
Sau96I BstNI
Cfr13I EcoRII KpnI
MboII AsuI AsuI Asp718 MseI
TTGGGGCTGTGTCGGATTGAACGGGCCGGCCCAGGTACCACAAAAGCCGTCTGGATTAAC 240

MaeIII MboII
MaeIII NlaIII MboII MnlI TaqI
TGGAGTAACGACACGCAGTCATGTGTAACAAGAAAAACAATCTTCTTCGAGGTTGGTGGA 300

MaeI
ScrFI
NciI
HpaII MboII
BcnI HincII DdeI
GAAATTGCCCGGCTAGTTGACTACAGACCACAGGAAGACGGAACTGAGAAAACTTTTACA 360

ScrFI
BstNI NspHI MaeII HinP1I
MaeI EcoRII NlaIII AflIII HhaI
AGAAAATTCTCTAGCAAAATGCCAGGCACTTACATGCTTATGGACGTGTGCGCTACAAGG 420

HgaI TaqI
SfaNI NsiI MaeIII HphI
GACGCTGATGATAAATGCATCGAAGGCACAATTGTGGTGACAGTCAGGGTGTCCCTATAT 480

MnlI
HinfI HphI
MboII FokI DdeI TaqI
GACGAAGATAACAATGGTGTAATGGATGAAGGTAAGGTGATTCCATCTGAGACAATCGAG 540

DdeI
Bsp1286 MaeII
EcoRV FokI BanII TaqI PleI HinfI
GATGATATCAAGGACTGTGGGCTCTTAGACCAAGATGTTGAACTCGATTATACGTGGACT 600

DpnI AccI
Sau3AI MnlI MboII AlwNI
CAAAACGAGTGTGATCTACCAGACACAGTAGACGAGGCTGAAGACACACCGTCAGAAACT 660

DpnI
XhoII
EcoRI Sau3AI
MboII BglII MaeIII
GGAGAATTCTTCTGGTAGATCTATCAGACTACTTTTATCAGCAGGACAACTGGTCGTTAC 720

MaeII
AflIII MnlI
CAGACACCTATAACGTGTCCTCATCAATAA 750
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